<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community</title>
    <description>The most recent home feed on DEV Community.</description>
    <link>https://dev.to</link>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed"/>
    <language>en</language>
    <item>
      <title>How Do You Design and Develop APIs the Git-Native Way?</title>
      <dc:creator>Hassann</dc:creator>
      <pubDate>Wed, 03 Jun 2026 06:41:59 +0000</pubDate>
      <link>https://dev.to/hassann/how-do-you-design-and-develop-apis-the-git-native-way-2691</link>
      <guid>https://dev.to/hassann/how-do-you-design-and-develop-apis-the-git-native-way-2691</guid>
      <description>&lt;p&gt;Most API teams treat the contract as an afterthought: write code, generate a spec, then watch the two drift apart. Git-native API design reverses that flow. You treat the API contract as source code, version it in Git, and review every change the same way you review application logic.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://apidog.com/?utm_source=dev.to&amp;amp;utm_medium=wanda&amp;amp;utm_content=n8n-post-automation" class="crayons-btn crayons-btn--primary"&gt;Try Apidog today&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;This guide focuses on implementation discipline, not a single tool. You’ll design contracts in branches, review them in pull requests, and turn a committed spec into mocks, tests, and docs. The goal is simple: your Git history should also be your API history.&lt;/p&gt;

&lt;p&gt;If you already know what Spec-First tooling looks like and want the product walkthrough, read the companion piece on the &lt;a href="http://apidog.com/blog/git-native-api-workflow?utm_source=dev.to&amp;amp;utm_medium=wanda&amp;amp;utm_content=n8n-post-automation"&gt;git-native API workflow&lt;/a&gt;. This article stays focused on practice.&lt;/p&gt;

&lt;h2&gt;
  
  
  What “git-native” means for API work
&lt;/h2&gt;

&lt;p&gt;Git-native means your API definition lives in your repository as a plain text file. Not in a proprietary cloud database. Not behind a vendor login. A &lt;code&gt;.yaml&lt;/code&gt; or &lt;code&gt;.json&lt;/code&gt; file sits next to your code and is tracked by the same version control system your team already uses.&lt;/p&gt;

&lt;p&gt;In many cloud-locked API design tools, the contract lives in the vendor’s backend. You edit through a web UI, and your repository only contains an export. That export can become stale, and your Git history no longer explains how the API evolved.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxdo4ywtncq1n8h9r03h0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxdo4ywtncq1n8h9r03h0.png" alt="Git-native API workflow" width="800" height="478"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The git-native model inverts that relationship:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The file in &lt;code&gt;main&lt;/code&gt; is the contract.&lt;/li&gt;
&lt;li&gt;Any GUI is a view onto that file.&lt;/li&gt;
&lt;li&gt;Branches, commits, pull requests, blame, and rollback all apply to your API surface.&lt;/li&gt;
&lt;li&gt;Mocks, docs, tests, and generated clients derive from the committed spec.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A git-native setup has three core properties:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The spec is a text file in the repo.&lt;/li&gt;
&lt;li&gt;Changes flow through normal Git operations: branch, commit, PR, merge.&lt;/li&gt;
&lt;li&gt;Downstream artifacts derive from the committed file, not from a separate database.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Why design and develop APIs in Git
&lt;/h2&gt;

&lt;p&gt;You already trust Git with your code. Your API contract deserves the same treatment.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. History
&lt;/h3&gt;

&lt;p&gt;When someone asks, “When did we add the &lt;code&gt;cursor&lt;/code&gt; pagination parameter?”, Git answers directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git log &lt;span class="nt"&gt;-S&lt;/span&gt; cursor &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nt"&gt;--&lt;/span&gt; api/openapi.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The commit that introduced the change includes an author, date, message, and diff. No screenshots. No manual changelog archaeology.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Blame
&lt;/h3&gt;

&lt;p&gt;Use &lt;code&gt;git blame&lt;/code&gt; to find who changed a field and when:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git blame api/openapi.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A confusing field name can be traced back to the PR that added it, including the review discussion.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Rollback
&lt;/h3&gt;

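&lt;p&gt;If a bad design ships, revert the merge. When the PR landed as a true merge commit, &lt;code&gt;-m 1&lt;/code&gt; tells Git to treat the first parent (the &lt;code&gt;main&lt;/code&gt; side) as the mainline; a squash merge is an ordinary commit and needs no &lt;code&gt;-m&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git revert &lt;span class="nt"&gt;-m&lt;/span&gt; 1 &amp;lt;merge-commit-sha&amp;gt;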
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The contract returns to its previous state. Codegen, mocks, docs, and tests regenerate from the reverted file.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Review
&lt;/h3&gt;

&lt;p&gt;A pull request is the right place to debate API design before implementation.&lt;/p&gt;

&lt;p&gt;Reviewers can comment on the exact &lt;code&gt;+&lt;/code&gt; line that adds a required field, changes a response shape, or introduces a new enum value. The design discussion stays attached to the change permanently.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Single source of truth
&lt;/h3&gt;

&lt;p&gt;When the contract is one file in &lt;code&gt;main&lt;/code&gt;, there is no ambiguity about which version is real. Frontend, backend, QA, and docs all read the same OpenAPI definition.&lt;/p&gt;

&lt;p&gt;That is the core value of a &lt;a href="http://apidog.com/blog/openapi-version-control-with-git?utm_source=dev.to&amp;amp;utm_medium=wanda&amp;amp;utm_content=n8n-post-automation"&gt;git-based API specification workflow&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The git-native API design loop
&lt;/h2&gt;

&lt;p&gt;The loop has five steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a branch.&lt;/li&gt;
&lt;li&gt;Edit the contract.&lt;/li&gt;
&lt;li&gt;Commit the design change.&lt;/li&gt;
&lt;li&gt;Open a pull request and review the design.&lt;/li&gt;
&lt;li&gt;Merge, then implement.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Implementation follows the merged contract, not the other way around.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Create a branch
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git checkout &lt;span class="nt"&gt;-b&lt;/span&gt; feat/api-invoices-list
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Edit the OpenAPI file
&lt;/h3&gt;

&lt;p&gt;Suppose you are adding an endpoint to fetch a user’s invoices.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# api/openapi.yaml&lt;/span&gt;
&lt;span class="na"&gt;paths&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="s"&gt;/users/{userId}/invoices&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;operationId&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;listUserInvoices&lt;/span&gt;
      &lt;span class="na"&gt;summary&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;List invoices for a user&lt;/span&gt;
      &lt;span class="na"&gt;parameters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;userId&lt;/span&gt;
          &lt;span class="na"&gt;in&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;path&lt;/span&gt;
          &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
          &lt;span class="na"&gt;schema&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
            &lt;span class="na"&gt;format&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;uuid&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;status&lt;/span&gt;
          &lt;span class="na"&gt;in&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;query&lt;/span&gt;
          &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
          &lt;span class="na"&gt;schema&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
            &lt;span class="na"&gt;enum&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;draft&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;open&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;paid&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;void&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;responses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;200"&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;A page of invoices&lt;/span&gt;
          &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;application/json&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;schema&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                &lt;span class="na"&gt;$ref&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;#/components/schemas/InvoiceList"&lt;/span&gt;
      &lt;span class="err"&gt;  &lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;404"&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;User not found&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Commit the design change
&lt;/h3&gt;

&lt;p&gt;Keep the commit small and specific:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git add api/openapi.yaml
git commit &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s2"&gt;"Add GET /users/{userId}/invoices contract"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 4: Open a pull request
&lt;/h3&gt;

&lt;p&gt;The PR diff should show one logical design change:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One path&lt;/li&gt;
&lt;li&gt;One operation&lt;/li&gt;
&lt;li&gt;Two parameters&lt;/li&gt;
&lt;li&gt;Two responses&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Reviewers can now discuss:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Is &lt;code&gt;listUserInvoices&lt;/code&gt; the right &lt;code&gt;operationId&lt;/code&gt;?&lt;/li&gt;
&lt;li&gt;Does the &lt;code&gt;status&lt;/code&gt; enum cover every invoice state that clients need to filter by?&lt;/li&gt;
&lt;li&gt;Should this endpoint support pagination?&lt;/li&gt;
&lt;li&gt;Is &lt;code&gt;404&lt;/code&gt; correct for a missing user?&lt;/li&gt;
&lt;li&gt;Does the response schema match existing conventions?&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 5: Merge, then implement
&lt;/h3&gt;

&lt;p&gt;After approval, merge the contract into &lt;code&gt;main&lt;/code&gt;. The implementation is then constrained by the agreed spec.&lt;/p&gt;

&lt;p&gt;This is the practical meaning of &lt;a href="http://apidog.com/blog/spec-first-api-development?utm_source=dev.to&amp;amp;utm_medium=wanda&amp;amp;utm_content=n8n-post-automation"&gt;spec-first API development&lt;/a&gt;: the agreement comes before the code.&lt;/p&gt;

&lt;p&gt;The payoff is cost control. Changing a YAML field during review takes minutes. Changing a shipped, implemented, documented endpoint can take days.&lt;/p&gt;

&lt;h2&gt;
  
  
  Branching strategy for API contracts
&lt;/h2&gt;

&lt;p&gt;Treat contract changes like code changes: one branch per logical unit of work.&lt;/p&gt;

&lt;p&gt;Small branches keep diffs readable and make API review realistic.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Change type&lt;/th&gt;
&lt;th&gt;Branch prefix&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;th&gt;Review weight&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;New endpoint&lt;/td&gt;
&lt;td&gt;&lt;code&gt;feat/api-&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;feat/api-invoices-list&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Standard&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Additive field&lt;/td&gt;
&lt;td&gt;&lt;code&gt;feat/api-&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;feat/api-invoice-currency&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Light&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Breaking change&lt;/td&gt;
&lt;td&gt;&lt;code&gt;break/api-&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;break/api-remove-legacy-id&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Heavy, needs sign-off&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bug fix in spec&lt;/td&gt;
&lt;td&gt;&lt;code&gt;fix/api-&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;fix/api-status-enum-typo&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Light&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Refactor only&lt;/td&gt;
&lt;td&gt;&lt;code&gt;chore/api-&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;chore/api-reorder-schemas&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Light&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The prefix communicates intent.&lt;/p&gt;

&lt;p&gt;A &lt;code&gt;break/api-&lt;/code&gt; branch tells reviewers to slow down and check consumers. A &lt;code&gt;chore/api-&lt;/code&gt; branch signals no semantic API change, so review can move faster.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pick a branching model
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Best for&lt;/th&gt;
&lt;th&gt;API tradeoff&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Trunk-based&lt;/td&gt;
&lt;td&gt;Continuous delivery, small teams&lt;/td&gt;
&lt;td&gt;Contract evolves in small steps; less merge pain&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gitflow&lt;/td&gt;
&lt;td&gt;Scheduled releases, regulated shipping&lt;/td&gt;
&lt;td&gt;Spec diverges on &lt;code&gt;develop&lt;/code&gt;; bigger, riskier merges&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For most API teams, prefer trunk-based development:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Short-lived branches&lt;/li&gt;
&lt;li&gt;Small PRs&lt;/li&gt;
&lt;li&gt;Frequent merges into &lt;code&gt;main&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Less spec drift&lt;/li&gt;
&lt;li&gt;Fewer YAML merge conflicts&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Long-lived branches are risky because two teams can restructure the same spec file and create painful conflicts. If that happens often, split the spec into multiple files with &lt;code&gt;$ref&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Reviewing API design in pull requests
&lt;/h2&gt;

&lt;p&gt;A spec PR is a design review, not just a syntax check.&lt;/p&gt;

&lt;p&gt;Reviewers should focus on semantic impact.&lt;/p&gt;

&lt;h3&gt;
  
  
  Check for breaking changes
&lt;/h3&gt;

&lt;p&gt;Breaking changes include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Removing a field&lt;/li&gt;
&lt;li&gt;Renaming a path&lt;/li&gt;
&lt;li&gt;Changing a response type&lt;/li&gt;
&lt;li&gt;Making an optional request field or parameter required&lt;/li&gt;
&lt;li&gt;Removing an enum value&lt;/li&gt;
&lt;li&gt;Tightening validation rules on requests&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the change is breaking, require:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Explicit PR labeling&lt;/li&gt;
&lt;li&gt;API steward approval&lt;/li&gt;
&lt;li&gt;Version bump&lt;/li&gt;
&lt;li&gt;Migration or deprecation plan&lt;/li&gt;
&lt;/ul&gt;
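
&lt;p&gt;Part of this check can be automated. The workflow below is a minimal sketch, not a prescribed setup: it assumes the open-source &lt;code&gt;oasdiff&lt;/code&gt; tool run from its &lt;code&gt;tufin/oasdiff&lt;/code&gt; Docker image, the &lt;code&gt;api/openapi.yaml&lt;/code&gt; path used in this guide, and a hypothetical workflow file name. It checks out the PR and its base branch side by side and fails when the comparison reports a breaking change.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# .github/workflows/api-breaking.yml (hypothetical file name)
name: API Breaking Changes

on:
  pull_request:
    paths:
      - "api/**"

jobs:
  oasdiff:
    runs-on: ubuntu-latest
    steps:
      # The contract as proposed by the pull request
      - uses: actions/checkout@v4
        with:
          path: head

      # The current contract on the branch the pull request targets
      - uses: actions/checkout@v4
        with:
          ref: ${{ github.base_ref }}
          path: base

      # Assumption: oasdiff's "breaking" command, failing on ERR-level findings
      - name: Detect breaking changes
        run: |
          docker run --rm -v "$PWD:/work:ro" tufin/oasdiff breaking \
            /work/base/api/openapi.yaml /work/head/api/openapi.yaml \
            --fail-on ERR
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A red check here does not replace steward judgment. It only makes sure a breaking change cannot merge unnoticed.&lt;/p&gt;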

&lt;h3&gt;
  
  
  Check naming consistency
&lt;/h3&gt;

&lt;p&gt;Look for consistency with the existing API:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Are collection paths plural?&lt;/li&gt;
&lt;li&gt;Are path parameters named consistently?&lt;/li&gt;
&lt;li&gt;Do error responses use the same shape?&lt;/li&gt;
&lt;li&gt;Are enum values styled the same way?&lt;/li&gt;
&lt;li&gt;Does the &lt;code&gt;operationId&lt;/code&gt; follow your pattern?&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Check diff readability
&lt;/h3&gt;

&lt;p&gt;Stable YAML makes review easier.&lt;/p&gt;

&lt;p&gt;Use consistent ordering for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;paths&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;HTTP methods&lt;/li&gt;
&lt;li&gt;&lt;code&gt;parameters&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;responses&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;components.schemas&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Avoid reformatting the whole file in the same PR as a semantic change. A five-line diff is reviewable. A 500-line reordered spec hides the real change.&lt;/p&gt;

&lt;h3&gt;
  
  
  Example: safe enum addition
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight diff"&gt;&lt;code&gt; parameters:
   - name: status
     in: query
     schema:
       type: string
&lt;span class="gd"&gt;-      enum: [draft, open, paid, void]
&lt;/span&gt;&lt;span class="gi"&gt;+      enum: [draft, open, paid, void, uncollectible]
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



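&lt;p&gt;This widens the set of values accepted by a request parameter, so it is additive: existing clients keep working. Adding a value to an enum that appears in responses is riskier, because clients that handle the values exhaustively may not expect the new one.&lt;/p&gt;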

&lt;p&gt;Compare that with removing &lt;code&gt;void&lt;/code&gt;, which would break any client that sends that value.&lt;/p&gt;

&lt;p&gt;Inline comments make this process concrete. Reviewers should comment on the spec line just like they comment on application code.&lt;/p&gt;

&lt;h2&gt;
  
  
  From design to development
&lt;/h2&gt;

&lt;p&gt;Once the contract is in &lt;code&gt;main&lt;/code&gt;, it becomes the input for everything downstream.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F88eliqp9p7y6r6vwzrcm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F88eliqp9p7y6r6vwzrcm.png" alt="Design to development workflow" width="800" height="459"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Generate code
&lt;/h3&gt;

&lt;p&gt;Use tools like &lt;code&gt;openapi-generator&lt;/code&gt; to generate server stubs or typed clients from the committed spec.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openapi-generator-cli generate &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-i&lt;/span&gt; api/openapi.yaml &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-g&lt;/span&gt; typescript-fetch &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-o&lt;/span&gt; generated/clients/typescript
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your application code fills in business logic, but request and response shapes come from the contract.&lt;/p&gt;

&lt;h3&gt;
  
  
  Generate mocks
&lt;/h3&gt;

&lt;p&gt;Run a mock server from the OpenAPI file so frontend developers can build before the backend is complete.&lt;/p&gt;

&lt;p&gt;The contract becomes usable immediately after merge.&lt;/p&gt;
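
&lt;p&gt;One way to do this, sketched here under assumptions, is the open-source Prism mock server run through Docker Compose. The compose file name and the &lt;code&gt;mock&lt;/code&gt; service name are hypothetical, and the spec must be a complete, valid OpenAPI document for the mock to start.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# docker-compose.mock.yml (hypothetical file name)
services:
  mock:
    image: stoplight/prism:4
    # Serve mock responses derived from the committed contract
    command: mock -h 0.0.0.0 /tmp/openapi.yaml
    volumes:
      - ./api/openapi.yaml:/tmp/openapi.yaml:ro
    ports:
      # 4010 is Prism's default port
      - "4010:4010"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;After &lt;code&gt;docker compose -f docker-compose.mock.yml up&lt;/code&gt;, requests to &lt;code&gt;http://localhost:4010&lt;/code&gt; are answered with responses built from the examples and schemas in the spec.&lt;/p&gt;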

&lt;h3&gt;
  
  
  Add contract tests
&lt;/h3&gt;

&lt;p&gt;Contract tests verify that the running server matches the committed spec.&lt;/p&gt;

&lt;p&gt;A typical flow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Start the API server in CI.&lt;/li&gt;
&lt;li&gt;Send real requests.&lt;/li&gt;
&lt;li&gt;Validate responses against &lt;code&gt;api/openapi.yaml&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Fail the build if the server and spec diverge.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This turns spec/code drift into a pipeline failure instead of a production bug.&lt;/p&gt;
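
&lt;p&gt;The flow above could look roughly like the workflow below. Treat it as a sketch under assumptions: it uses Schemathesis as the contract-testing tool (pinned to the 3.x line because its CLI options changed in version 4), a hypothetical &lt;code&gt;scripts/start-server.sh&lt;/code&gt; that starts your server in the background and waits until it is ready, and port &lt;code&gt;8080&lt;/code&gt;.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# .github/workflows/api-contract.yml (hypothetical file name)
name: API Contract Tests

on:
  pull_request:

jobs:
  contract:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Hypothetical script: starts the server in the background and waits for readiness
      - name: Start the API server
        run: ./scripts/start-server.sh

      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Install Schemathesis
        run: pip install "schemathesis==3.*"

      # Generates requests from the spec, sends them, and checks each response
      - name: Validate responses against the spec
        run: schemathesis run api/openapi.yaml --base-url http://localhost:8080 --checks all
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The last step exits with a non-zero status when a response does not conform to the spec, which fails the job.&lt;/p&gt;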

&lt;h3&gt;
  
  
  Generate docs
&lt;/h3&gt;

&lt;p&gt;Reference docs should render from the same OpenAPI file.&lt;/p&gt;

&lt;p&gt;When the contract changes, docs change with it. No separate manual doc update should be required.&lt;/p&gt;

&lt;p&gt;The rule is simple: every API artifact should derive from the committed contract.&lt;/p&gt;

&lt;h2&gt;
  
  
  Team conventions that scale
&lt;/h2&gt;

&lt;p&gt;Conventions keep a git-native workflow manageable as the team grows.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Choose one spec file or many
&lt;/h3&gt;

&lt;p&gt;A single &lt;code&gt;openapi.yaml&lt;/code&gt; is simple and works well for smaller APIs.&lt;/p&gt;

&lt;p&gt;As the API grows, split the spec:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;api/
  openapi.yaml
  paths/
    users.yaml
    invoices.yaml
  schemas/
    user.yaml
    invoice.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use &lt;code&gt;$ref&lt;/code&gt; to connect files and bundle them in CI.&lt;/p&gt;
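
&lt;p&gt;As an illustration, the root file for the layout above might look like the sketch below. The API title, the &lt;code&gt;userInvoices&lt;/code&gt; key inside &lt;code&gt;paths/invoices.yaml&lt;/code&gt;, and the internal structure of the referenced files are assumptions made for the example.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# api/openapi.yaml (root file)
openapi: 3.0.3
info:
  title: Invoices API   # hypothetical title
  version: 2.1.0
paths:
  /users/{userId}/invoices:
    # Assumes paths/invoices.yaml holds the path item under a "userInvoices" key
    $ref: "./paths/invoices.yaml#/userInvoices"
components:
  schemas:
    Invoice:
      # Assumes schemas/invoice.yaml contains the schema object at its top level
      $ref: "./schemas/invoice.yaml"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A bundler can then resolve the references into a single file for tools that expect one. With Redocly CLI, for example, that is &lt;code&gt;npx @redocly/cli bundle api/openapi.yaml -o dist/openapi.yaml&lt;/code&gt;, where the &lt;code&gt;dist/&lt;/code&gt; output path is a placeholder.&lt;/p&gt;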

&lt;h3&gt;
  
  
  2. Version deliberately
&lt;/h3&gt;

&lt;p&gt;Update &lt;code&gt;info.version&lt;/code&gt; for meaningful contract changes.&lt;/p&gt;

&lt;p&gt;A practical versioning convention:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Additive change: minor version bump&lt;/li&gt;
&lt;li&gt;Bug fix or documentation correction: patch version bump&lt;/li&gt;
&lt;li&gt;Breaking change: major version bump&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Breaking changes often require a new path prefix such as &lt;code&gt;/v2/&lt;/code&gt;.&lt;/p&gt;
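
&lt;p&gt;For reference, &lt;code&gt;info.version&lt;/code&gt; sits at the top of the root spec file. The fragment below is a small sketch with a hypothetical API title, showing the minor bump that an additive change such as a new endpoint would get under this convention.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# api/openapi.yaml
openapi: 3.0.3
info:
  title: Invoices API   # hypothetical title
  # 2.0.0 to 2.1.0: additive change (new endpoint), so a minor bump
  version: 2.1.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;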

&lt;h3&gt;
  
  
  3. Keep a changelog
&lt;/h3&gt;

&lt;p&gt;Place &lt;code&gt;CHANGELOG.md&lt;/code&gt; next to the spec.&lt;/p&gt;

&lt;p&gt;Git history is precise, but a changelog is easier for API consumers to scan.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# API Changelog&lt;/span&gt;

&lt;span class="gu"&gt;## 2.1.0&lt;/span&gt;
&lt;span class="p"&gt;
-&lt;/span&gt; Added &lt;span class="sb"&gt;`GET /users/{userId}/invoices`&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Added &lt;span class="sb"&gt;`uncollectible`&lt;/span&gt; invoice status

&lt;span class="gu"&gt;## 2.0.0&lt;/span&gt;
&lt;span class="p"&gt;
-&lt;/span&gt; Removed legacy &lt;span class="sb"&gt;`customer_id`&lt;/span&gt; field
&lt;span class="p"&gt;-&lt;/span&gt; Introduced &lt;span class="sb"&gt;`/v2`&lt;/span&gt; invoice endpoints
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. Protect the spec with CODEOWNERS
&lt;/h3&gt;

&lt;p&gt;Require API stewards to approve contract changes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# .github/CODEOWNERS
# Teams are written as @org/team-name; "your-org" is a placeholder
/api/openapi.yaml @your-org/api-stewards
/api/paths/ @your-org/api-stewards
/api/schemas/ @your-org/api-stewards
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



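&lt;p&gt;On its own, &lt;code&gt;CODEOWNERS&lt;/code&gt; only requests a review. To make steward approval mandatory, also enable “Require review from Code Owners” in the branch protection rules for &lt;code&gt;main&lt;/code&gt;. Together, they keep inconsistent changes from slipping into the contract.&lt;/p&gt;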

&lt;h3&gt;
  
  
  5. Lint in CI
&lt;/h3&gt;

&lt;p&gt;Use a linter to catch style and consistency issues before human review.&lt;/p&gt;

&lt;p&gt;Example GitHub Actions workflow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# .github/workflows/api-lint.yml&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;API Lint&lt;/span&gt;

&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;pull_request&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;paths&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;api/**"&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;spectral&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run Spectral&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npx @stoplight/spectral-cli lint api/openapi.yaml --fail-severity warn&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
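
&lt;p&gt;Spectral 6 and later expects a ruleset file and reports an error when it finds none, so a workflow like this one needs a &lt;code&gt;.spectral.yaml&lt;/code&gt; in the repository. A minimal sketch, assuming you start from the built-in OpenAPI ruleset and adjust a couple of rules (the two overrides are examples, not recommendations):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# .spectral.yaml (repository root)
extends:
  - "spectral:oas"
rules:
  # Raise a built-in rule from its default severity to error
  operation-operationId: error
  # Turn off a built-in rule the team has decided not to enforce
  operation-description: off
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;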



&lt;p&gt;With linting plus &lt;code&gt;CODEOWNERS&lt;/code&gt;, each contract change gets automated checks and human review.&lt;/p&gt;

&lt;h2&gt;
  
  
  Common pitfalls and how to avoid them
&lt;/h2&gt;

&lt;p&gt;Git-native API design has predictable failure modes.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pitfall 1: Spec/code drift
&lt;/h3&gt;

&lt;p&gt;The spec says one thing. The running server does another.&lt;/p&gt;

&lt;p&gt;Avoid it with contract tests in CI. Validate live responses against the committed spec and fail the build on divergence.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pitfall 2: Giant PRs
&lt;/h3&gt;

&lt;p&gt;A branch that adds twenty endpoints is hard to review.&lt;/p&gt;

&lt;p&gt;Avoid it by splitting API work into small PRs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One endpoint&lt;/li&gt;
&lt;li&gt;One schema change&lt;/li&gt;
&lt;li&gt;One behavior change&lt;/li&gt;
&lt;li&gt;One breaking change proposal&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Small diffs get real review.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pitfall 3: Hand-written artifacts
&lt;/h3&gt;

&lt;p&gt;Hand-written clients, docs, or mocks can silently drift from the spec.&lt;/p&gt;

&lt;p&gt;Avoid it by generating artifacts from the committed OpenAPI file every time.&lt;/p&gt;

&lt;p&gt;Treat hand-written API artifacts as a smell.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pitfall 4: YAML merge conflicts
&lt;/h3&gt;

&lt;p&gt;Long-lived branches and large spec files create painful merge conflicts.&lt;/p&gt;

&lt;p&gt;Avoid them with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Short-lived branches&lt;/li&gt;
&lt;li&gt;Stable key ordering&lt;/li&gt;
&lt;li&gt;Split-file specs&lt;/li&gt;
&lt;li&gt;Trunk-based development&lt;/li&gt;
&lt;li&gt;Small PRs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The pattern is consistent: keep changes small, generate from the spec, and let CI enforce the contract.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where Apidog fits
&lt;/h2&gt;

&lt;p&gt;You can run a git-native workflow with a text editor and a CLI. Many teams, however, want a visual design surface without giving up Git as the source of truth.&lt;/p&gt;

&lt;p&gt;That is the gap Apidog’s Spec-First Mode fills.&lt;/p&gt;

&lt;p&gt;Spec-First Mode keeps the OpenAPI file in your Git repository and supports two-way sync. You can edit the contract in Apidog’s visual designer or in your editor, while the file in Git remains canonical. Branches, PRs, and history still work as described above.&lt;/p&gt;

&lt;p&gt;See the &lt;a href="https://docs.apidog.com/spec-first-mode-beta-2058268m0?utm_source=dev.to&amp;amp;utm_medium=wanda&amp;amp;utm_content=n8n-post-automation"&gt;Spec-First Mode documentation&lt;/a&gt; for setup details.&lt;/p&gt;

&lt;p&gt;The point is not to replace Git. The point is to add a GUI while keeping the repository as the single source of truth.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Is git-native API design only for OpenAPI?
&lt;/h3&gt;

&lt;p&gt;No. The discipline applies to any text-based contract format.&lt;/p&gt;

&lt;p&gt;OpenAPI is common, but the same workflow works for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AsyncAPI&lt;/li&gt;
&lt;li&gt;gRPC &lt;code&gt;.proto&lt;/code&gt; files&lt;/li&gt;
&lt;li&gt;GraphQL SDL&lt;/li&gt;
&lt;li&gt;JSON Schema&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the contract is a text file you can diff, branch, and review, it can be git-native.&lt;/p&gt;

&lt;h3&gt;
  
  
  How do I handle breaking changes in a git-native workflow?
&lt;/h3&gt;

&lt;p&gt;Make breaking changes visible and deliberate.&lt;/p&gt;

&lt;p&gt;Use a &lt;code&gt;break/api-&lt;/code&gt; branch prefix, bump the major version, and require steward approval through &lt;code&gt;CODEOWNERS&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Where possible, add the new shape alongside the old one and deprecate the old path on a timeline. The PR diff and version bump should clearly signal the break to consumers.&lt;/p&gt;

&lt;h3&gt;
  
  
  Should the API spec live in the same repo as the code?
&lt;/h3&gt;

&lt;p&gt;Usually yes, if one team owns both the API and implementation.&lt;/p&gt;

&lt;p&gt;Co-locating the spec and code means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One PR can update contract and handler together.&lt;/li&gt;
&lt;li&gt;Contract tests run in one pipeline.&lt;/li&gt;
&lt;li&gt;Reviewers can see implementation impact.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Use a separate spec repo only when many teams consume one shared API and need independent versioning.&lt;/p&gt;

&lt;h3&gt;
  
  
  How do I prevent spec and code from drifting apart?
&lt;/h3&gt;

&lt;p&gt;Add contract tests to CI.&lt;/p&gt;

&lt;p&gt;They should send real requests to your running server and validate responses against the committed spec. If the server and spec diverge, the build fails.&lt;/p&gt;

&lt;p&gt;Combine that with generated stubs, clients, mocks, and docs to keep the whole API workflow aligned.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Git-native API design is a discipline, not a product. You treat the contract as source code, evolve it in branches, review it in pull requests, and generate downstream artifacts from the committed file.&lt;/p&gt;

&lt;p&gt;Start small:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Move your spec into the repo.&lt;/li&gt;
&lt;li&gt;Add API linting in CI.&lt;/li&gt;
&lt;li&gt;Protect the spec with &lt;code&gt;CODEOWNERS&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Review contract changes in PRs.&lt;/li&gt;
&lt;li&gt;Generate clients, mocks, docs, and tests from the spec.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The workflow compounds. Each convention makes the next one easier, and your Git history becomes a complete record of how your API grew.&lt;/p&gt;

&lt;p&gt;If you want a visual design surface that keeps the spec in Git, try Spec-First Mode in Apidog and see how two-way sync fits the workflow above.&lt;/p&gt;

</description>
      <category>api</category>
      <category>architecture</category>
      <category>git</category>
      <category>softwareengineering</category>
    </item>
    <item>
      <title>The Container Port Binding Mistake That Breaks Almost Every First Deploy</title>
      <dc:creator>Thomas Plat</dc:creator>
      <pubDate>Wed, 03 Jun 2026 06:40:17 +0000</pubDate>
      <link>https://dev.to/thpl/the-container-port-binding-mistake-that-breaks-almost-every-first-deploy-3fng</link>
      <guid>https://dev.to/thpl/the-container-port-binding-mistake-that-breaks-almost-every-first-deploy-3fng</guid>
      <description>&lt;p&gt;You deploy your app. The build succeeds. The logs show the server starting. You click the URL your deployment platform gave you and get a connection error, a 502, or nothing at all.&lt;/p&gt;

&lt;p&gt;This is one of the most common first deployment failures, and the cause is almost always the same: the app is binding to the wrong address, or listening on the wrong port, or both.&lt;/p&gt;

&lt;h2&gt;
  
  
  What port binding actually means
&lt;/h2&gt;

&lt;p&gt;When a server app starts, it listens for incoming connections on a network address. That address has two parts: the IP address it listens on, and the port number.&lt;/p&gt;

&lt;p&gt;The IP address determines which network interfaces the application accepts connections from. &lt;code&gt;localhost&lt;/code&gt; (which normally resolves to &lt;code&gt;127.0.0.1&lt;/code&gt;, or &lt;code&gt;::1&lt;/code&gt; over IPv6) is the loopback address, so the app only accepts connections from the same machine. &lt;code&gt;0.0.0.0&lt;/code&gt; means the app accepts connections on every IPv4 network interface, including external ones.&lt;/p&gt;

&lt;p&gt;During local development, &lt;code&gt;localhost&lt;/code&gt; is fine. Everything is on the same machine. Your browser and your server are both on your laptop. When you deploy to a server, the platform's load balancer is not on the same machine as your app. It is trying to connect from outside. An app bound to &lt;code&gt;localhost&lt;/code&gt; is invisible to it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The port problem
&lt;/h2&gt;

&lt;p&gt;Deployment platforms often assign ports dynamically. They tell your app which port to use through an environment variable, almost always called &lt;code&gt;PORT&lt;/code&gt;. Your app needs to read this variable and bind to that port.&lt;/p&gt;

&lt;p&gt;If your app ignores &lt;code&gt;PORT&lt;/code&gt; and hardcodes a port number, it starts on a port the platform is not watching. The platform tries to connect on its assigned port, gets nothing, and marks the deployment as failed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// This will fail on most platforms&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;listen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;// This is correct&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;listen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;PORT&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;3000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;|| 3000&lt;/code&gt; fallback makes the app work both locally (where &lt;code&gt;PORT&lt;/code&gt; is not set) and in production (where the platform sets it).&lt;/p&gt;
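
&lt;p&gt;One detail worth sketching: &lt;code&gt;process.env.PORT&lt;/code&gt; is always a string, and a malformed value is easier to catch at startup than in a failed health check. The helper below is a minimal sketch under that assumption. &lt;code&gt;resolvePort&lt;/code&gt; is a hypothetical name, not part of Express or any platform SDK.&lt;/p&gt;

```javascript
// Hypothetical helper (not from any framework): turn the PORT variable into a
// validated number, falling back to a local default when it is unset.
function resolvePort(env, fallback = 3000) {
  const raw = env.PORT;

  // Unset or empty PORT usually means the app is running locally.
  if (raw === undefined || raw === '') return fallback;

  // Accept only plain decimal digits, so "abc" or "80.5" fail loudly.
  if (!/^\d+$/.test(raw)) {
    throw new Error(`Invalid PORT value: ${raw}`);
  }

  const port = Number(raw);
  // TCP ports stop at 65535.
  if (port > 65535) {
    throw new Error(`PORT out of range: ${raw}`);
  }
  return port;
}
```

&lt;p&gt;Usage would look like &lt;code&gt;app.listen(resolvePort(process.env))&lt;/code&gt;, which behaves like the &lt;code&gt;|| 3000&lt;/code&gt; version for valid input but refuses to start on a value the platform could never route to.&lt;/p&gt;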

&lt;h2&gt;
  
  
  What AI tools generate
&lt;/h2&gt;

&lt;p&gt;AI tools often hardcode both the address and the port. The generated code looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;listen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;localhost&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Server running on http://localhost:3000&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is correct for local development and wrong for production. The &lt;code&gt;'localhost'&lt;/code&gt; argument is the binding address. Remove it entirely or replace it with &lt;code&gt;'0.0.0.0'&lt;/code&gt;. Replace &lt;code&gt;3000&lt;/code&gt; with &lt;code&gt;process.env.PORT || 3000&lt;/code&gt;, and update the log message so it reports the port actually in use.&lt;/p&gt;

&lt;p&gt;The same pattern shows up in generated code for multiple frameworks: Express, Fastify, and Hapi alike. The underlying behavior is the same in all of them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Framework-specific fixes
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Express:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;port&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;PORT&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;3000&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;listen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;port&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;0.0.0.0&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Fastify:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;fastify&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;listen&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;PORT&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;3000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;host&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;0.0.0.0&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;AdonisJS:&lt;/strong&gt; Set &lt;code&gt;HOST=0.0.0.0&lt;/code&gt; and &lt;code&gt;PORT&lt;/code&gt; in your environment. AdonisJS reads both from environment variables automatically.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Next.js:&lt;/strong&gt; Next.js handles port binding correctly by default and reads &lt;code&gt;PORT&lt;/code&gt; from the environment. No manual fix needed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;NestJS:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;listen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;PORT&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;3000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;0.0.0.0&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  How to diagnose it
&lt;/h2&gt;

&lt;p&gt;If your deployment shows the application is starting but the health check is failing, check two things in the startup logs:&lt;/p&gt;

&lt;p&gt;First, what address is the server logging? If you see &lt;code&gt;Listening on http://localhost:3000&lt;/code&gt; or &lt;code&gt;Server running on 127.0.0.1:3000&lt;/code&gt;, the app is bound to localhost. External traffic cannot reach it.&lt;/p&gt;

&lt;p&gt;Second, what port is the app using? If it is hardcoded and does not match the &lt;code&gt;PORT&lt;/code&gt; environment variable, the platform is sending traffic to the wrong port.&lt;/p&gt;

&lt;p&gt;Both of these are visible in the startup log lines that most frameworks print when they start successfully.&lt;/p&gt;
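
&lt;p&gt;A small startup check can make both problems explicit in the logs. The sketch below is an illustration under stated assumptions, not something a platform provides: &lt;code&gt;bindingWarnings&lt;/code&gt; is a hypothetical helper that takes an object shaped like the one Node's &lt;code&gt;server.address()&lt;/code&gt; returns for a TCP server, plus the environment, and lists what looks wrong.&lt;/p&gt;

```javascript
// Hypothetical helper: inspect where a server actually bound and report the
// two problems described above. `bound` has the shape that Node's
// server.address() returns for TCP servers: { address, family, port }.
const LOOPBACK_ADDRESSES = new Set(['127.0.0.1', '::1', 'localhost']);

function bindingWarnings(bound, env) {
  const warnings = [];

  // Problem 1: bound to loopback, so only same-machine clients can connect.
  if (LOOPBACK_ADDRESSES.has(bound.address)) {
    warnings.push(`Bound to loopback (${bound.address}); external traffic cannot reach it.`);
  }

  // Problem 2: the platform set PORT, but the app is listening elsewhere.
  if (env.PORT !== undefined) {
    if (String(bound.port) !== String(env.PORT)) {
      warnings.push(`Listening on ${bound.port} but PORT is ${env.PORT}.`);
    }
  }

  return warnings;
}
```

&lt;p&gt;One way to use it is inside the listen callback: keep the server returned by &lt;code&gt;app.listen(...)&lt;/code&gt;, call &lt;code&gt;bindingWarnings(server.address(), process.env)&lt;/code&gt;, and print each warning with &lt;code&gt;console.warn&lt;/code&gt;.&lt;/p&gt;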

&lt;h2&gt;
  
  
  Why this is so consistent
&lt;/h2&gt;

&lt;p&gt;This failure is nearly universal for first deployments because it is invisible during development. The hardcoded localhost binding works perfectly when you are testing locally. Nothing ever fails. The code ships, the app starts on the server, and the binding address becomes a problem for the first time.&lt;/p&gt;

&lt;p&gt;Catching this before deployment is one of the more valuable things a deployment platform can do automatically. jetpacked.ai detects hardcoded port and address bindings during repo analysis. Apps that don't listen on &lt;code&gt;0.0.0.0&lt;/code&gt; or that ignore &lt;code&gt;PORT&lt;/code&gt; won't serve traffic, and surfacing that before the build starts saves the debugging loop entirely.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>AI Native DevCon Day 2: From Agent Demos to Operating Models</title>
      <dc:creator>Rohan Sharma</dc:creator>
      <pubDate>Wed, 03 Jun 2026 06:40:09 +0000</pubDate>
      <link>https://dev.to/tessl/ai-native-devcon-day-2-from-agent-demos-to-operating-models-51hf</link>
      <guid>https://dev.to/tessl/ai-native-devcon-day-2-from-agent-demos-to-operating-models-51hf</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;Day 2 of AI Native DevCon shifted from agent capability to operating discipline. The strongest sessions focused on how teams can run AI-native delivery with clearer context pipelines, measurable agent behavior, safer execution boundaries, and better organizational ownership.&lt;/p&gt;

&lt;p&gt;The scale showed up in the numbers too. Across the two days, DevCon brought together 650+ in-person registrations, around 2,000 online registrations, and a packed mix of sessions, workshops, hallway conversations, and practical lessons.&lt;/p&gt;

&lt;p&gt;Day 2 leaned into workshops. That shift mattered because the second day was less about proving agents can do useful work and more about showing how teams can make that work repeatable.&lt;/p&gt;

&lt;p&gt;Hey there, welcome back. &lt;a href="https://www.linkedin.com/in/rohan-sharma-9386rs/" rel="noopener noreferrer"&gt;Rohan Sharma&lt;/a&gt; here again, continuing the DevCon series.&lt;/p&gt;

&lt;p&gt;Day 1 gave us the framing, including &lt;a href="https://www.linkedin.com/in/guypo/" rel="noopener noreferrer"&gt;Guy Podjarny&lt;/a&gt;’s core point that skills should be treated like real software assets. Day 2 picked up from there and moved into the operating details. Once agents are inside daily engineering work, platform and product teams need to decide what changes first, who owns those changes, and how the results are measured.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7tmgtur5904c1xg2ty6c.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7tmgtur5904c1xg2ty6c.jpg" alt="day1" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Talks that shaped Day 2
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Harness engineering beyond code
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.linkedin.com/in/marcsloan/" rel="noopener noreferrer"&gt;Marc Sloan&lt;/a&gt; from Tessl focused on the next gap many teams are hitting. Code context is increasingly structured, but product and design context still lives in external systems such as Figma, Notion, and Linear. Pulling that context live can reduce staleness, but it introduces drift in evals, versioning, and reproducibility.&lt;/p&gt;

&lt;p&gt;The practical lesson was to stop treating external product and design context as random reference material. Teams need a defined layer between the repository and those external systems, with clear versioning so evaluations can be replayed against known context snapshots.&lt;/p&gt;

&lt;p&gt;Without that, agents can produce work that looks technically correct while missing the product constraint that actually mattered. That is a very expensive kind of almost-right.&lt;/p&gt;

&lt;h3&gt;
  
  
  From vibes to metrics
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.linkedin.com/in/simonobstbaum/" rel="noopener noreferrer"&gt;Simon Obstbaum&lt;/a&gt; and &lt;a href="https://www.linkedin.com/in/robertgwilloughby/" rel="noopener noreferrer"&gt;Rob Willoughby&lt;/a&gt; from Tessl delivered a session focused on a challenge many engineering leaders are currently facing. Their distinction between output evals and trajectory evals is operationally important. A good answer is not enough if the agent used risky tools, skipped required checks, or ignored policy steps.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fypw3qaky2rov21ea2q8j.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fypw3qaky2rov21ea2q8j.jpg" alt="rob" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The useful measurement model came down to activation, trajectory, and outcome. Did the right skill trigger? Did the agent follow the right steps? Was the final result actually useful and correct?&lt;/p&gt;

&lt;p&gt;The good part was the emphasis on partial compliance. Pass or fail is too blunt for agent workflows. If a workflow degrades halfway through, teams need to know where it happened, not just that something felt off.&lt;/p&gt;

&lt;h3&gt;
  
  
  Benchmarking beyond the model
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://uk.linkedin.com/in/amit-kushwaha28" rel="noopener noreferrer"&gt;Amit Kushwaha&lt;/a&gt; highlighted why many current benchmarks miss real agent behavior. Agent systems run long traces with tool calls, context accumulation, and latency bottlenecks that one-shot benchmark numbers do not capture.&lt;/p&gt;

&lt;p&gt;For teams choosing infrastructure, the warning was clear. Do not optimize only for model speed. Real agent workloads involve tools, memory, caches, retries, and long-running traces.&lt;/p&gt;

&lt;p&gt;The better benchmark is closer to production reality, with multi-turn tasks, tool latency, tail latency, and cache behavior over time. Otherwise teams risk picking systems that look great in a chart and struggle in the actual workflow.&lt;/p&gt;

&lt;h3&gt;
  
  
  Safe execution boundaries for agents
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.linkedin.com/in/shelajev/" rel="noopener noreferrer"&gt;Oleg Šelajev&lt;/a&gt; from Docker covered a problem every platform team eventually sees. An unconstrained agent can make high-impact changes in the wrong environment. Sandboxing is not optional once agents are allowed to execute.&lt;/p&gt;

&lt;p&gt;The practical takeaway was to treat environment policy as part of the harness. Filesystem access, network access, secrets, and permissions all need clear boundaries before agents are given the ability to act.&lt;/p&gt;

&lt;p&gt;This is how teams lower blast radius. Not by hoping the agent behaves nicely, but by designing the room it is allowed to move around in.&lt;/p&gt;

&lt;h3&gt;
  
  
  Do not write prompts, write software
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.linkedin.com/in/jbaruch" rel="noopener noreferrer"&gt;Baruch Sadogursky&lt;/a&gt; and &lt;a href="https://www.linkedin.com/in/maceybaker/" rel="noopener noreferrer"&gt;Macey Baker&lt;/a&gt; from Tessl reinforced an idea that keeps proving useful in production. Break behavior into modular skills instead of maintaining one giant prompt. This makes agent behavior easier to test, review, and reuse.&lt;/p&gt;

&lt;p&gt;The message was not “write a better mega prompt.” It was to turn repeatable behavior into composable skills that match real workflow stages. That gives teams something they can review, test, improve, and share across repos.&lt;/p&gt;

&lt;p&gt;If you try one thing from this workshop, use the materials and skill templates as a starting point. Prototype one small skill pipeline in your own environment before trying to scale the pattern across every repo.&lt;/p&gt;

&lt;h2&gt;
  
  
  What kept coming up across the day
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Context quality is now a platform responsibility
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.linkedin.com/in/marcsloan/" rel="noopener noreferrer"&gt;Marc Sloan&lt;/a&gt;, &lt;a href="https://www.linkedin.com/in/smithshaun/" rel="noopener noreferrer"&gt;Shaun Smith&lt;/a&gt;, and &lt;a href="https://www.linkedin.com/in/john-groetzinger/" rel="noopener noreferrer"&gt;John Groetzinger&lt;/a&gt; approached this from different angles, but the operational message was consistent. Context delivery is becoming an engineering system, not documentation hygiene. Teams need predictable context pipelines for both humans and agents.&lt;/p&gt;

&lt;p&gt;The next step is ownership. Teams need to know who maintains context sources, how often they refresh, and how changes are versioned. Context also needs observability so teams can trace which inputs shaped an agent decision.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Agent performance needs production-grade telemetry
&lt;/h3&gt;

&lt;p&gt;The sessions from &lt;a href="https://www.linkedin.com/in/simonobstbaum/" rel="noopener noreferrer"&gt;Simon Obstbaum&lt;/a&gt; and &lt;a href="https://www.linkedin.com/in/robertgwilloughby/" rel="noopener noreferrer"&gt;Rob Willoughby&lt;/a&gt; from Tessl, plus &lt;a href="https://uk.linkedin.com/in/amit-kushwaha28" rel="noopener noreferrer"&gt;Amit Kushwaha&lt;/a&gt; from NVIDIA and &lt;a href="https://www.linkedin.com/in/justincormack/" rel="noopener noreferrer"&gt;Justin Cormack&lt;/a&gt;, former CTO at Docker, made this very concrete. Teams need to measure how agents worked, not only what they returned.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F36vlk57fuoml49x7hh5c.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F36vlk57fuoml49x7hh5c.jpg" alt="justin" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Trajectory metrics belong next to existing quality signals. If your dashboards already show test health, release health, or incident trends, agent workflow quality should sit in the same operational view.&lt;/p&gt;

&lt;p&gt;The benchmark scenarios should also look like real work. Multi-turn, tool-heavy, slightly messy, and full of the same constraints your teams face every day. Justin’s observability point connected neatly here too. Teams need runtime signals that can reveal agent-induced drift before it becomes a bigger production problem.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Adoption is an organizational design problem, not a tooling checkbox
&lt;/h3&gt;

&lt;p&gt;Talks from &lt;a href="https://www.linkedin.com/in/tammuzdubnov/" rel="noopener noreferrer"&gt;Tammuz Dubnov&lt;/a&gt; and &lt;a href="https://www.linkedin.com/in/birgittaboeckeler/" rel="noopener noreferrer"&gt;Birgitta Böckeler&lt;/a&gt; from Thoughtworks showed that adoption succeeds when review structures, ownership boundaries, and team rituals evolve with the tooling.&lt;/p&gt;

&lt;p&gt;That means setting explicit contribution boundaries for AI-assisted changes and updating review criteria. The diff still matters, but so does the path the agent took to produce it. Birgitta’s adoption data made this especially grounded by showing where hidden costs appear, including review load, technical debt, and maintainability when speed becomes the only metric.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Workshops made the ideas practical
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.linkedin.com/in/jbaruch" rel="noopener noreferrer"&gt;Baruch Sadogursky&lt;/a&gt; and &lt;a href="https://www.linkedin.com/in/maceybaker/" rel="noopener noreferrer"&gt;Macey Baker&lt;/a&gt; from Tessl, along with &lt;a href="https://www.linkedin.com/in/alfonso-graziano/" rel="noopener noreferrer"&gt;Alfonso Graziano&lt;/a&gt; from Nearform, helped turn the bigger Day 2 ideas into something teams could actually try. The workshop-heavy format made the day feel less like theory and more like practice.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.linkedin.com/in/derekashmore/" rel="noopener noreferrer"&gt;Derek Ashmore&lt;/a&gt;’s packed workshop, &lt;strong&gt;“The AI Agent Testing Pyramid,”&lt;/strong&gt; focused on the different levels of testing agent systems need. For those following from home, you can attempt it on your own by following &lt;a href="https://github.com/AsperitasConsulting/research-summarizer-agent" rel="noopener noreferrer"&gt;this repo&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmxs4pmkc5rz7jh3wy667.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmxs4pmkc5rz7jh3wy667.jpg" alt="derek" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.linkedin.com/in/lamis-mukta/" rel="noopener noreferrer"&gt;Aashrey Tiku&lt;/a&gt; from Anthropic worked through a hands-on session on shipping a managed agent. It was a useful bridge between agent concepts and the practical work of packaging, managing, and operating an agent with the right boundaries.&lt;/p&gt;

&lt;p&gt;That mattered because AI-native development is still new enough that people need patterns they can test, not just concepts they can nod along to. Alfonso’s spec-driven angle fit well here because prompts become far more useful when they are turned into testable, production-ready specifications.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Agent enablement needs real ownership
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.linkedin.com/in/anatomic/" rel="noopener noreferrer"&gt;Ian Thomas&lt;/a&gt; from Meta and &lt;a href="https://www.linkedin.com/in/katie-roberts-3bbb2316/" rel="noopener noreferrer"&gt;Katie Roberts&lt;/a&gt; from Nearform made the enablement side feel practical. Rollouts work better when platform safeguards are paired with updated team rituals, clear ownership, and realistic guidance for brownfield systems.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffpu0182z5excdycdvlnl.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffpu0182z5excdycdvlnl.jpg" alt="ian" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Katie’s legacy advice was especially useful. AI should help teams modernize incrementally, not generate another fragile layer on top of systems that are already hard to maintain.&lt;/p&gt;

&lt;h2&gt;
  
  
  If you missed Day 1, &lt;a href="https://www.youtube.com/watch?v=akZ85mG5HXY" rel="noopener noreferrer"&gt;start here&lt;/a&gt;
&lt;/h2&gt;

&lt;p&gt;Day 2 was workshop-heavy. If you missed the &lt;a href="https://www.youtube.com/watch?v=akZ85mG5HXY" rel="noopener noreferrer"&gt;Day 1 virtual stream&lt;/a&gt;, start with these talks before digging into the workshop themes.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;&lt;a href="https://www.linkedin.com/in/guypo/" rel="noopener noreferrer"&gt;Guy Podjarny&lt;/a&gt;, Tessl&lt;/strong&gt; - Skills are the new Code&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;&lt;a href="https://www.linkedin.com/in/dglawson" rel="noopener noreferrer"&gt;Dana Lawson&lt;/a&gt;, Netlify&lt;/strong&gt; - Built for Humans. Now Agents Are Here.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;&lt;a href="https://www.linkedin.com/in/jimbomoss/" rel="noopener noreferrer"&gt;James Moss&lt;/a&gt;, Tessl&lt;/strong&gt; - Using skills to pay the bills&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;&lt;a href="https://www.linkedin.com/in/talliran/" rel="noopener noreferrer"&gt;Liran Tal&lt;/a&gt;, Snyk&lt;/strong&gt; - Your AI Agent Installed Malware Because a SKILL.md Told It To&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;&lt;a href="https://www.linkedin.com/in/ryanlopopolo/?_l=en_US" rel="noopener noreferrer"&gt;Ryan Lopopolo&lt;/a&gt;, OpenAI&lt;/strong&gt; - Harness Engineering&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;&lt;a href="https://be.linkedin.com/in/patrickdebois" rel="noopener noreferrer"&gt;Patrick Debois&lt;/a&gt;, Tessl&lt;/strong&gt; - The Rise of Agent Enablement&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;&lt;a href="https://www.linkedin.com/in/shachar-azriel-215748127/" rel="noopener noreferrer"&gt;Shachar Azriel&lt;/a&gt;, Baz&lt;/strong&gt; - Executable Specs&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;&lt;a href="https://www.linkedin.com/in/may-walterr/" rel="noopener noreferrer"&gt;May Walter&lt;/a&gt;, Hud&lt;/strong&gt; - Runtime Intelligence for Continuous Agentic Performance Optimization&lt;/li&gt;
&lt;li&gt;  &lt;a href="https://www.linkedin.com/in/dave-farley-a67927" rel="noopener noreferrer"&gt;&lt;strong&gt;Dave Farley&lt;/strong&gt;&lt;/a&gt; - Vibe Coding: Is this really the best we can do?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That set gives the right foundation for Day 2 across skills, context, verification, security, harnesses, runtime feedback, and team enablement.&lt;/p&gt;

&lt;h2&gt;
  
  
  AI Native DevCon is not over yet!
&lt;/h2&gt;

&lt;p&gt;We are already working on the next AI DevCon, and yes, we are very excited to say that AI DevCon NYC is officially on the way.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4rcrewhayjs1otwkikvh.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4rcrewhayjs1otwkikvh.jpg" alt="devcon nyc" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If Day 1 gave the frame and Day 2 showed the operating model, NYC is where the conversation gets even more practical. Expect more on skills, harnesses, agent safety, context systems, benchmarking, product workflows, and what it really takes to make AI-native delivery work inside teams.&lt;/p&gt;

&lt;p&gt;Super-early-bird seats are available now. If you want to be in the room for the next round of conversations, this is the time to grab a spot.&lt;/p&gt;

&lt;p&gt;In the meantime, &lt;a href="https://tessl.io/newsletter/" rel="noopener noreferrer"&gt;register for the AI DevCon newsletter&lt;/a&gt;. We will release the content shared at the conference, including selected highlights, session clips, notes, slide decks, and workshop materials, as it is published.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>security</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Building an Edge REST API with Hono.js + TypeScript — From Bun Local Server to Cloudflare Workers</title>
      <dc:creator>Jangwook Kim</dc:creator>
      <pubDate>Wed, 03 Jun 2026 06:40:03 +0000</pubDate>
      <link>https://dev.to/jangwook_kim_e31e7291ad98/building-an-edge-rest-api-with-honojs-typescript-from-bun-local-server-to-cloudflare-workers-4b4m</link>
      <guid>https://dev.to/jangwook_kim_e31e7291ad98/building-an-edge-rest-api-with-honojs-typescript-from-bun-local-server-to-cloudflare-workers-4b4m</guid>
      <description>&lt;p&gt;If you've ever built a REST API with Express, you've probably felt it. Middleware registration, type definitions, body parser setup, connecting Joi or Zod... the structure is simple, but the boilerplate is excessive. When I first saw Hono, I was skeptical. "Another Express clone," I thought. That changed when I actually ran it.&lt;/p&gt;

&lt;p&gt;Bottom line: Hono v4 is more than just lightweight and fast. TypeScript type inference flows naturally all the way to route handlers. Zod validation connects via a single official package. On Bun, response times are noticeably faster than Express. Everything in this post is based on what I ran in a sandbox in June 2026.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Hono — Compared to Express and Fastify
&lt;/h2&gt;

&lt;p&gt;Understanding where Hono fits means answering three questions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bundle size&lt;/strong&gt;: Hono v4 core is about 12KB. Express is 58KB, Fastify is 77KB. The gap might not sound dramatic, but in edge environments like Cloudflare Workers or Deno Deploy, bundle size directly affects cold start time. Edge functions sometimes initialize a new runtime per request — smaller means faster first response.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Runtime compatibility&lt;/strong&gt;: Express is Node.js-only. Fastify targets Node.js by default. Hono was designed from the start to "run anywhere." The same code deploys to Bun, Deno, Cloudflare Workers, Node.js, and AWS Lambda@Edge.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TypeScript support&lt;/strong&gt;: Express requires &lt;code&gt;@types/express&lt;/code&gt; as a separate install, and properties added to &lt;code&gt;req&lt;/code&gt; via middleware don't get type inference. Hono is written in TypeScript from the ground up, and the &lt;code&gt;Hono&amp;lt;{ Bindings: Env; Variables: Variables }&amp;gt;&lt;/code&gt; generic gives you type-safe access to environment variables and middleware state.&lt;/p&gt;

&lt;p&gt;I'm not saying Hono is the right choice for every situation. If your team is deeply invested in Express, or you need a mature plugin ecosystem, there's no compelling reason to switch. But if edge deployment is the goal, or you want type safety from day one, Hono is the most convincing TypeScript API framework right now.&lt;/p&gt;

&lt;h2&gt;
  
  
  Installation and First Server — Response in 30 Seconds
&lt;/h2&gt;

&lt;p&gt;I started from scratch in a sandbox running Bun 1.3.14.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Initialize a new project&lt;/span&gt;
bun init &lt;span class="nt"&gt;-y&lt;/span&gt;

&lt;span class="c"&gt;# Install Hono v4&lt;/span&gt;
bun add hono

&lt;span class="c"&gt;# Add Zod validation packages&lt;/span&gt;
bun add zod @hono/zod-validator
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;bun add v1.3.14 (0d9b296a)
installed hono@4.12.23
installed @hono/zod-validator@0.8.0
installed zod@4.4.3
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Install time was under 500ms. Hono itself ships with no runtime dependencies, so there is almost nothing to resolve.&lt;/p&gt;

&lt;p&gt;The simplest possible server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// index.ts&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Hono&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;hono&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Hono&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Hello from Hono!&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}))&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;bun run index.ts
&lt;span class="c"&gt;# Started development server: http://localhost:3000&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl http://localhost:3000/
&lt;span class="c"&gt;# {"message":"Hello from Hono!"}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;export default app&lt;/code&gt;: that single line is recognized as the entry point for Bun, Deno, and Cloudflare Workers alike. For Node.js, install the &lt;code&gt;@hono/node-server&lt;/code&gt; adapter and call its &lt;code&gt;serve(app)&lt;/code&gt;, and you're done. No runtime-branching code needed. That felt like the biggest quality-of-life win.&lt;/p&gt;

&lt;h2&gt;
  
  
  Middleware Stack — logger, CORS, timing
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/..%2F..%2F..%2Fassets%2Fblog%2Fhono-typescript-api-2026%2Fhono-typescript-api-2026-arch.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/..%2F..%2F..%2Fassets%2Fblog%2Fhono-typescript-api-2026%2Fhono-typescript-api-2026-arch.png" alt="Hono Middleware Stack Architecture"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Hono ships its built-in middleware as subpath imports of the form &lt;code&gt;hono/middleware-name&lt;/code&gt;. You only pull in what you use, so nothing extra ends up in the bundle.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Hono&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;hono&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;logger&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;hono/logger&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;cors&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;hono/cors&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;timing&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;hono/timing&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Hono&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;// Registration order equals execution order&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;*&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;*&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;cors&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;*&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;timing&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
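&lt;p&gt;To make the "registration order equals execution order" comment concrete, here is a dependency-free sketch of onion-style dispatch. It is not Hono's source: &lt;code&gt;named&lt;/code&gt; and &lt;code&gt;run&lt;/code&gt; are hypothetical helpers, the middleware names are only labels, and the sketch is synchronous for brevity, whereas real Hono middleware awaits &lt;code&gt;next()&lt;/code&gt;.&lt;/p&gt;

```typescript
// Illustrative only: a tiny synchronous model of onion-style middleware.
// Assumption: each middleware does work before and after calling next().
type Middleware = (log: string[], next: () => void) => void

// Hypothetical factory: builds a middleware that records when it runs.
const named = (name: string): Middleware => (log, next) => {
  log.push(name + ': before')
  next()
  log.push(name + ': after')
}

// Hypothetical runner: walks the stack in registration order,
// then records the route handler once the stack is exhausted.
function run(stack: Middleware[]): string[] {
  const log: string[] = []
  const dispatch = (i: number): void => {
    const mw = stack[i]
    if (mw === undefined) {
      log.push('handler')
      return
    }
    mw(log, () => dispatch(i + 1))
  }
  dispatch(0)
  return log
}

const order = run([named('logger'), named('cors'), named('timing')])
console.log(order.join('\n'))
```

&lt;p&gt;In this model the "before" halves run in registration order and the "after" halves unwind in reverse, which is why a logger registered first can wrap the whole request.&lt;/p&gt;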



&lt;p&gt;With &lt;code&gt;logger()&lt;/code&gt;, each request prints:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;&amp;lt;-- GET /tasks
--&amp;gt; GET /tasks 200 0ms
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When I ran this, the speed was obvious: the first request took 3ms, and subsequent requests logged 0ms (sub-millisecond on the server side). With &lt;code&gt;timing()&lt;/code&gt;, a &lt;code&gt;Server-Timing&lt;/code&gt; header is added to responses, so you can see per-stage timing in the Network tab of Chrome DevTools.&lt;/p&gt;

&lt;p&gt;CORS takes fine-grained options:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;*&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;cors&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;origin&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://jangwook.net&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;http://localhost:5173&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;allowMethods&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;GET&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;PATCH&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;DELETE&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;allowHeaders&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Authorization&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;}))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;cors()&lt;/code&gt; default allows all origins. In production, always specify &lt;code&gt;origin&lt;/code&gt; explicitly.&lt;/p&gt;
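&lt;p&gt;Why an explicit &lt;code&gt;origin&lt;/code&gt; list matters is easier to see in a small sketch of allow-list matching. This is a general illustration of the idea, not Hono's implementation; &lt;code&gt;resolveAllowOrigin&lt;/code&gt; is a hypothetical helper, and the permissive fallback is an assumption that mirrors the "allow all" default described above.&lt;/p&gt;

```typescript
// Illustrative only: decides what to send in Access-Control-Allow-Origin.
// Returns null when the header should be omitted (the browser then blocks the read).
function resolveAllowOrigin(
  requestOrigin: string | undefined,
  allowList?: string[],
): string | null {
  if (allowList === undefined) return '*' // assumed permissive default
  if (requestOrigin === undefined) return null // no Origin header to match
  return allowList.includes(requestOrigin) ? requestOrigin : null
}

const allowList = ['https://jangwook.net', 'http://localhost:5173']

// Without a list, any site is let through.
const openResult = resolveAllowOrigin('https://evil.example')
// With a list, unknown origins get no header, known origins are echoed back.
const blockedResult = resolveAllowOrigin('https://evil.example', allowList)
const allowedResult = resolveAllowOrigin('https://jangwook.net', allowList)

console.log(openResult, blockedResult, allowedResult)
```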

&lt;h2&gt;
  
  
  Zod Validation — Automatic 400 Errors
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;@hono/zod-validator&lt;/code&gt; is Hono's official Zod integration. Drop it in as middleware on a route, and any Zod schema validation failure automatically returns a 400.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;zValidator&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@hono/zod-validator&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;zod&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;createTaskSchema&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;object&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Title is required&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Max 100 characters&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;completed&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;boolean&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;optional&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/tasks&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;zValidator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;createTaskSchema&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;valid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="c1"&gt;// body is typed as z.infer&amp;lt;typeof createTaskSchema&amp;gt;&lt;/span&gt;
  &lt;span class="c1"&gt;// body.title is string, body.completed is boolean — no undefined&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;nextId&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;createdAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;toISOString&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="nx"&gt;tasks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;task&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;task&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="mi"&gt;201&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Test run with an empty title:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:3000/tasks &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"title":""}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"success"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ZodError"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"[{&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;code&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;too_small&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;,&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;minimum&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;:1,&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;path&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;:[&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;title&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;],&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;message&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;Title is required&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;}]"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;HTTP 400, automatically. No validation code needed inside the handler.&lt;/p&gt;
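&lt;p&gt;Conceptually, a validator middleware runs the schema first and only hands parsed data to the handler. The dependency-free sketch below mimics that flow with a hand-written check in place of Zod. &lt;code&gt;parseCreateTask&lt;/code&gt; and &lt;code&gt;handleCreateTask&lt;/code&gt; are hypothetical names, and the error shape is simplified rather than copied from &lt;code&gt;@hono/zod-validator&lt;/code&gt;.&lt;/p&gt;

```typescript
// Illustrative only: validate-then-handle without any library.
type CreateTask = { title: string; completed: boolean }
type ParseResult =
  | { success: true; data: CreateTask }
  | { success: false; issues: string[] }

// Hypothetical stand-in for the schema: title is 1 to 100 chars, completed defaults to false.
function parseCreateTask(input: unknown): ParseResult {
  if (typeof input !== 'object' || input === null) {
    return { success: false, issues: ['Body must be an object'] }
  }
  const raw = input as { title?: unknown; completed?: unknown }
  const issues: string[] = []

  const title = raw.title
  if (typeof title !== 'string' || title.length === 0) issues.push('Title is required')
  else if (title.length > 100) issues.push('Max 100 characters')

  // Apply the default before checking the type, as a schema default would.
  const completed = raw.completed === undefined ? false : raw.completed
  if (typeof completed !== 'boolean') issues.push('completed must be a boolean')

  if (issues.length > 0 || typeof title !== 'string' || typeof completed !== 'boolean') {
    return { success: false, issues }
  }
  return { success: true, data: { title, completed } }
}

// Hypothetical handler wrapper: short-circuits with 400 before any business logic runs.
function handleCreateTask(input: unknown): { status: number; body: unknown } {
  const result = parseCreateTask(input)
  if (result.success === false) {
    return { status: 400, body: { success: false, error: result.issues } }
  }
  return { status: 201, body: { data: result.data } }
}

const rejected = handleCreateTask({ title: '' })
const accepted = handleCreateTask({ title: 'Write docs' })
console.log(rejected.status, accepted.status)
```

&lt;p&gt;The handler branch never sees invalid input, which is the same guarantee &lt;code&gt;c.req.valid('json')&lt;/code&gt; gives you inside a real route.&lt;/p&gt;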

&lt;p&gt;&lt;code&gt;c.req.valid('json')&lt;/code&gt; is the key. What comes back is already Zod-validated and fully typed. If you've worked with &lt;a href="https://dev.to/en/blog/en/typescript-zod-v4-claude-api-structured-output-guide-2026"&gt;Zod v4 and Claude API structured output&lt;/a&gt;, the v4 schema API changes apply here too — &lt;code&gt;@hono/zod-validator&lt;/code&gt; supports both v3 and v4.&lt;/p&gt;

&lt;h2&gt;
  
  
  Full CRUD Implementation — With Real Execution Logs
&lt;/h2&gt;

&lt;p&gt;Here's the complete Task CRUD API, with the actual terminal output from running it. It uses in-memory storage; swap in D1, Prisma, or Drizzle for production.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Hono&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;hono&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;logger&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;hono/logger&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;cors&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;hono/cors&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;timing&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;hono/timing&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;zValidator&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@hono/zod-validator&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;zod&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Hono&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;*&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;*&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;cors&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;*&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;timing&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;Task&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;
  &lt;span class="nx"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;
  &lt;span class="nx"&gt;completed&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;boolean&lt;/span&gt;
  &lt;span class="nx"&gt;createdAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;tasks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Install Hono&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;completed&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;createdAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;toISOString&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Build REST API&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;completed&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;createdAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;toISOString&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;nextId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;createTaskSchema&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;object&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Title is required&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;completed&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;boolean&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;optional&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;updateTaskSchema&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;object&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;optional&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
  &lt;span class="na"&gt;completed&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;boolean&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;optional&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Task API&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;1.0.0&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;runtime&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Bun + Hono&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}))&lt;/span&gt;

&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/tasks&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;completedParam&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;completed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;tasks&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;completedParam&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="kc"&gt;undefined&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;tasks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;t&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;completed&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;completedParam&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;true&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;total&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/tasks&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;zValidator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;createTaskSchema&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;valid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="na"&gt;task&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;nextId&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;createdAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;toISOString&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="nx"&gt;tasks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;task&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;task&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="mi"&gt;201&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/tasks/:id&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;parseInt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;param&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;id&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;tasks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;t&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;task&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Task not found&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="mi"&gt;404&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;task&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;patch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/tasks/:id&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;zValidator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;updateTaskSchema&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;parseInt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;param&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;id&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;valid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;index&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;tasks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findIndex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;t&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;index&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Task not found&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="mi"&gt;404&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nx"&gt;tasks&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;index&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;tasks&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;index&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;tasks&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;index&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/tasks/:id&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;parseInt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;param&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;id&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;index&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;tasks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findIndex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;t&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;index&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Task not found&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="mi"&gt;404&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nx"&gt;tasks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;splice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Deleted successfully&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
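
&lt;p&gt;One gap in the handlers above: &lt;code&gt;parseInt&lt;/code&gt; accepts input such as &lt;code&gt;'12abc'&lt;/code&gt; (parsed as 12) and returns &lt;code&gt;NaN&lt;/code&gt; for non-numeric ids, and any &lt;code&gt;completed&lt;/code&gt; value other than &lt;code&gt;'true'&lt;/code&gt; is silently treated as &lt;code&gt;false&lt;/code&gt;. A stricter variant might look like the sketch below. The helper names &lt;code&gt;parseTaskId&lt;/code&gt; and &lt;code&gt;parseCompleted&lt;/code&gt; are hypothetical and not part of Hono.&lt;/p&gt;

```typescript
// Hypothetical helpers, not part of Hono: strict parsing for the
// `:id` path parameter and the `completed` query parameter.

// Returns the numeric id, or null when the raw value is not a plain
// non-negative integer (parseInt alone would accept "12abc" as 12).
function parseTaskId(raw: string | undefined): number | null {
  if (raw === undefined || !/^\d+$/.test(raw)) return null
  const id = Number.parseInt(raw, 10)
  return Number.isSafeInteger(id) ? id : null
}

// Returns true or false for "true" or "false", undefined when the
// parameter is absent, and null for anything else so the caller can
// answer with a 400 instead of guessing.
function parseCompleted(raw: string | undefined): boolean | undefined | null {
  if (raw === undefined) return undefined
  if (raw === 'true') return true
  if (raw === 'false') return false
  return null
}

console.log(parseTaskId('3'), parseTaskId('12abc'), parseCompleted('yes'))
```

&lt;p&gt;In a handler, a &lt;code&gt;null&lt;/code&gt; result from either helper could be mapped to &lt;code&gt;c.json({ error: 'Invalid parameter' }, 400)&lt;/code&gt; before touching the task list.&lt;/p&gt;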



&lt;p&gt;Real terminal output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;bun run index.ts
&lt;span class="go"&gt;Started development server: http://localhost:3000

&amp;lt;-- GET /
&lt;/span&gt;&lt;span class="gp"&gt;--&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;GET / 200 4ms
&lt;span class="go"&gt;
&amp;lt;-- GET /tasks
&lt;/span&gt;&lt;span class="gp"&gt;--&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;GET /tasks 200 2ms
&lt;span class="go"&gt;
&amp;lt;-- POST /tasks
&lt;/span&gt;&lt;span class="gp"&gt;--&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;POST /tasks 201 4ms
&lt;span class="go"&gt;
&amp;lt;-- GET /tasks/3
&lt;/span&gt;&lt;span class="gp"&gt;--&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;GET /tasks/3 200 0ms
&lt;span class="go"&gt;
&amp;lt;-- PATCH /tasks/2
&lt;/span&gt;&lt;span class="gp"&gt;--&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;PATCH /tasks/2 200 0ms
&lt;span class="go"&gt;
&amp;lt;-- DELETE /tasks/1
&lt;/span&gt;&lt;span class="gp"&gt;--&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;DELETE /tasks/1 200 0ms
&lt;span class="go"&gt;
&amp;lt;-- POST /tasks  (empty title)
&lt;/span&gt;&lt;span class="gp"&gt;--&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;POST /tasks 400 0ms
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
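
&lt;p&gt;The logger output above reports whole milliseconds, so &lt;code&gt;0ms&lt;/code&gt; only says that a request finished in under 1 ms. For a finer comparison, one option is to time the code path directly with &lt;code&gt;performance.now()&lt;/code&gt;. The sketch below is a hypothetical helper (&lt;code&gt;medianMs&lt;/code&gt; is not from Hono or Bun), and it measures an in-memory filter rather than a full HTTP round trip, so treat the numbers as relative.&lt;/p&gt;

```typescript
// Hypothetical micro-benchmark helper. `handler` stands in for whatever
// code path is being measured; `runs` is assumed to be at least 1.
function medianMs(handler: () => void, runs: number): number {
  const samples = Array.from({ length: runs }, () => {
    const start = performance.now() // fractional milliseconds
    handler()
    return performance.now() - start
  })
  samples.sort((a, b) => a - b)
  return samples[Math.floor(samples.length / 2)]
}

// Example: time a cheap in-memory filter similar to the /tasks handler.
const sampleTasks = Array.from({ length: 1000 }, (_, i) => ({
  id: i,
  completed: i % 2 === 0,
}))

const median = medianMs(() => {
  sampleTasks.filter((t) => t.completed)
}, 201)

console.log(`median: ${median.toFixed(3)} ms`)
```

&lt;p&gt;Using the median instead of the mean keeps a single slow run, such as the first request, from skewing the result.&lt;/p&gt;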



&lt;p&gt;Performance numbers: the first request took 4 ms, and warm requests were sub-millisecond (shown as 0ms in the logger output). The same logic in Express on the same machine took 1 to 2 ms warm. I'd expect the gap to be larger on a real production edge deployment, though that is an expectation, not a measurement.&lt;/p&gt;

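&lt;p&gt;Two things explain this performance: Bun's JavaScriptCore engine and Hono's routing. Hono's default setup chooses between a RegExp-based router and a Trie-based router, so matching cost stays close to constant as you add routes instead of growing with a linear scan through every registered route.&lt;/p&gt;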
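
&lt;p&gt;To make the "no linear scanning" point concrete, here is a simplified segment trie. It is an illustration only, not Hono's actual router: the class names are hypothetical, handlers are plain labels, and there is no backtracking or wildcard support. The point it shows is that a lookup walks one node per path segment, so its cost depends on the depth of the path and not on how many routes are registered.&lt;/p&gt;

```typescript
// Illustrative segment trie; a simplified sketch, not Hono's implementation.
type Params = { [name: string]: string }

class TrieNode {
  // Static children keyed by path segment. A null-prototype object avoids
  // collisions with inherited keys such as "constructor".
  children: { [segment: string]: TrieNode } = Object.create(null)
  paramChild: TrieNode | null = null // child reached through a ":name" segment
  paramName = ''
  handler: string | null = null // a label instead of a real handler function
}

class TrieRouter {
  private root = new TrieNode()

  add(path: string, handler: string): void {
    let node = this.root
    for (const segment of path.split('/').filter(Boolean)) {
      if (segment.startsWith(':')) {
        let child = node.paramChild
        if (child === null) {
          child = new TrieNode()
          child.paramName = segment.slice(1)
          node.paramChild = child
        }
        node = child
      } else {
        if (!(segment in node.children)) node.children[segment] = new TrieNode()
        node = node.children[segment]
      }
    }
    node.handler = handler
  }

  // Walks one node per segment; static segments win over parameters.
  match(path: string): { handler: string; params: Params } | null {
    let node = this.root
    const params: Params = {}
    for (const segment of path.split('/').filter(Boolean)) {
      const staticChild = node.children[segment]
      const paramChild = node.paramChild
      if (staticChild !== undefined) {
        node = staticChild
      } else if (paramChild !== null) {
        params[paramChild.paramName] = segment
        node = paramChild
      } else {
        return null
      }
    }
    return node.handler === null ? null : { handler: node.handler, params }
  }
}

const router = new TrieRouter()
router.add('/tasks', 'listTasks')
router.add('/tasks/:id', 'getTask')
console.log(router.match('/tasks/3')) // handler "getTask", params.id "3"
```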

&lt;h2&gt;
  
  
  Cloudflare Workers Deployment: Minimal Code Changes
&lt;/h2&gt;

&lt;p&gt;The biggest Hono advantage: changing the deployment target barely changes the code.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;bun add &lt;span class="nt"&gt;-g&lt;/span&gt; wrangler
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="c"&gt;# wrangler.toml&lt;/span&gt;
&lt;span class="py"&gt;name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"hono-task-api"&lt;/span&gt;
&lt;span class="py"&gt;main&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"src/worker.ts"&lt;/span&gt;
&lt;span class="py"&gt;compatibility_date&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"2024-09-23"&lt;/span&gt;

&lt;span class="nn"&gt;[vars]&lt;/span&gt;
&lt;span class="py"&gt;ENVIRONMENT&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"production"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Connecting Cloudflare Workers environment variable types to Hono:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/worker.ts&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Hono&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;hono&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;cors&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;hono/cors&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;Bindings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;ENVIRONMENT&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;
  &lt;span class="na"&gt;DB&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;D1Database&lt;/span&gt;
  &lt;span class="na"&gt;KV&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;KVNamespace&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;Variables&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;Hono&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;Bindings&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Bindings&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nl"&gt;Variables&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Variables&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;*&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;cors&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/health&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; 
    &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ENVIRONMENT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="c1"&gt;// type-safe: string&lt;/span&gt;
    &lt;span class="na"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;toISOString&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="c1"&gt;// D1 database query&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/tasks&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;DB&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;prepare&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;SELECT * FROM tasks&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Simulate Cloudflare Workers locally&lt;/span&gt;
wrangler dev

&lt;span class="c"&gt;# Production deploy&lt;/span&gt;
wrangler deploy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



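&lt;p&gt;I didn't verify &lt;code&gt;wrangler deploy&lt;/code&gt;, since that requires an actual Cloudflare account. The code structure is as shown above. The main difference from the local Bun server is how you access bindings like &lt;code&gt;c.env.DB&lt;/code&gt;. Note that the &lt;code&gt;DB&lt;/code&gt; and &lt;code&gt;KV&lt;/code&gt; bindings also have to be declared in &lt;code&gt;wrangler.toml&lt;/code&gt;, which the minimal config above omits.&lt;/p&gt;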
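
&lt;p&gt;The D1 handler above returns every row, while the Bun version supported a &lt;code&gt;completed&lt;/code&gt; filter. A minimal sketch of porting that filter with bound parameters instead of string concatenation follows. It assumes a &lt;code&gt;tasks&lt;/code&gt; table with a &lt;code&gt;completed&lt;/code&gt; column stored as 0 or 1, and &lt;code&gt;buildTaskQuery&lt;/code&gt; is a hypothetical helper, not part of Hono or D1.&lt;/p&gt;

```typescript
// Hypothetical helper: builds the SQL text and bind parameters for the
// optional `completed` filter. Table and column names are assumptions.
type TaskQuery = { sql: string; params: number[] }

function buildTaskQuery(completed?: boolean): TaskQuery {
  if (completed === undefined) {
    return { sql: 'SELECT * FROM tasks ORDER BY id', params: [] }
  }
  // SQLite has no dedicated boolean type, so the flag is bound as 1 or 0.
  return {
    sql: 'SELECT * FROM tasks WHERE completed = ? ORDER BY id',
    params: [completed ? 1 : 0],
  }
}

// Inside the Worker handler this would be used roughly as:
//   const { sql, params } = buildTaskQuery(true)
//   const { results } = await c.env.DB.prepare(sql).bind(...params).all()
console.log(buildTaskQuery(true))
```

&lt;p&gt;Keeping the query construction in a pure function means it can be unit tested under plain Bun, without starting &lt;code&gt;wrangler dev&lt;/code&gt;.&lt;/p&gt;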

&lt;p&gt;&lt;a href="https://dev.to/en/blog/en/cloudflare-agents-week-2026-autonomous-infrastructure"&gt;Cloudflare Workers agent infrastructure&lt;/a&gt; shows how Hono sits at the API layer in Cloudflare-based AI agent systems. It's already being used this way in production.&lt;/p&gt;

&lt;h2&gt;
  
  
  Type-Safe Middleware with Variables
&lt;/h2&gt;

&lt;p&gt;Express required extending interfaces to get type-safe access to &lt;code&gt;req.user&lt;/code&gt;. Hono handles this more cleanly with the &lt;code&gt;Variables&lt;/code&gt; generic.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;Variables&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;
  &lt;span class="na"&gt;requestId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;Hono&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;Variables&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Variables&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;// Auth middleware&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/tasks/*&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;next&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;authHeader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;header&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Authorization&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;authHeader&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;startsWith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Bearer &lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Unauthorized&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="mi"&gt;401&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;userId&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user-123&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;requestId&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;crypto&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;randomUUID&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="c1"&gt;// Access in route handler — fully typed&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/tasks&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;userId&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;       &lt;span class="c1"&gt;// inferred as string&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;requestId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;requestId&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;// inferred as string&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;requestId&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;c.get('userId')&lt;/code&gt; returns &lt;code&gt;string&lt;/code&gt; — TypeScript infers this from the &lt;code&gt;Variables&lt;/code&gt; declaration. With Express, this inference didn't happen automatically.&lt;/p&gt;
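
&lt;p&gt;The auth middleware above accepts any header that starts with &lt;code&gt;Bearer &lt;/code&gt;, including one with no token after it, and then sets a fixed &lt;code&gt;userId&lt;/code&gt;. As a sketch of the missing step, a small helper could extract the token before it is verified. &lt;code&gt;parseBearerToken&lt;/code&gt; is a hypothetical name, and real token verification is left out.&lt;/p&gt;

```typescript
// Hypothetical helper, not a Hono API: extracts the token from an
// Authorization header value, or returns null when it is missing or malformed.
function parseBearerToken(header: string | undefined): string | null {
  if (header === undefined) return null
  // Requires the scheme, at least one space, and a token without whitespace.
  const match = /^Bearer\s+(\S+)$/.exec(header.trim())
  return match === null ? null : match[1]
}

console.log(parseBearerToken('Bearer abc123')) // "abc123"
console.log(parseBearerToken('Bearer '))       // null
```

&lt;p&gt;In the middleware, a &lt;code&gt;null&lt;/code&gt; result would map to the same 401 response, and the extracted token would then be verified before calling &lt;code&gt;c.set('userId', ...)&lt;/code&gt;. Hono also ships a &lt;code&gt;hono/bearer-auth&lt;/code&gt; middleware if a ready-made check fits your case.&lt;/p&gt;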

&lt;h2&gt;
  
  
  What I Found Frustrating
&lt;/h2&gt;

&lt;p&gt;There are real limitations worth naming.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ecosystem depth&lt;/strong&gt;: Fastify's plugin ecosystem is battle-hardened. &lt;code&gt;@fastify/swagger&lt;/code&gt; (formerly &lt;code&gt;fastify-swagger&lt;/code&gt;) generates OpenAPI specs from route schemas. &lt;code&gt;@fastify/multipart&lt;/code&gt; (formerly &lt;code&gt;fastify-multipart&lt;/code&gt;) handles file uploads. These are validated, maintained plugins. Hono's third-party ecosystem is thinner. The official middleware covers the basics, but unusual requirements mean writing your own.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;D1 local dev experience&lt;/strong&gt;: Testing against Cloudflare D1 locally means running through &lt;code&gt;wrangler dev&lt;/code&gt; with a D1 binding declared in &lt;code&gt;wrangler.toml&lt;/code&gt;. Recent Wrangler versions simulate D1 on the local machine, but creating the real database and deploying still require a Cloudflare account. SQLite compatibility makes Drizzle/Prisma usable, but the local dev setup is more involved than Express + PostgreSQL.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;wrangler dev&lt;/code&gt; cold start&lt;/strong&gt;: The first run of &lt;code&gt;wrangler dev&lt;/code&gt; is slow because it emulates the Cloudflare runtime. Running with Bun directly starts instantly — but that skips Workers-specific behavior testing.&lt;/p&gt;

&lt;p&gt;If edge deployment isn't your goal and you're building a conventional server, Fastify is more mature than Hono. The &lt;a href="https://dev.to/en/blog/en/ollama-fastapi-production-deployment-guide-2026"&gt;Ollama + FastAPI approach&lt;/a&gt; — different language, same concept — is another valid path.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to Choose Hono
&lt;/h2&gt;

&lt;p&gt;My judgment:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use Hono when:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cloudflare Workers, Deno Deploy, or Bun are your deployment targets&lt;/li&gt;
&lt;li&gt;You want TypeScript type safety from the first line&lt;/li&gt;
&lt;li&gt;Bundle size and cold start time matter for your service&lt;/li&gt;
&lt;li&gt;Small team, fast start, minimal boilerplate&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Don't bother switching when:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your team is comfortable with Express or Fastify and has no edge deployment plans&lt;/li&gt;
&lt;li&gt;You need a mature plugin ecosystem for enterprise-scale services&lt;/li&gt;
&lt;li&gt;Heavy integration with legacy Node.js code&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Hono's GitHub stars crossed 66,000 in 2026. If you've already &lt;a href="https://dev.to/en/blog/en/bun-shell-scripting-practical-guide-2026"&gt;set up a Bun Shell scripting environment&lt;/a&gt;, adding Hono is the logical next step. Same runtime, same package manager, same TypeScript ecosystem — API server included.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cheat Sheet — Patterns I Look Up Every Time
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Query parameter&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;page&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;limit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;parseInt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;limit&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;10&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;// Path parameter&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;param&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;id&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;// Request header&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;auth&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;header&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Authorization&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;// JSON response with status&lt;/span&gt;
&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="mi"&gt;201&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;// Text response&lt;/span&gt;
&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;OK&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;// Redirect&lt;/span&gt;
&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;redirect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/new-path&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;301&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;// Streaming response (Hono v4: import { stream } from 'hono/streaming')&lt;/span&gt;
&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="c1"&gt;// Cloudflare Workers env variable&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;dbUrl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;DATABASE_URL&lt;/span&gt;

&lt;span class="c1"&gt;// Route grouping&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;api&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Hono&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nx"&gt;api&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/users&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...)&lt;/span&gt;
&lt;span class="nx"&gt;api&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/users&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...)&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/v1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;api&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Wrap-Up — Notes After Running It
&lt;/h2&gt;

&lt;p&gt;This post started from &lt;code&gt;bun add hono @hono/zod-validator zod&lt;/code&gt; and worked through a full CRUD API. In-memory storage limits what you can call "production-ready," but the routing, middleware, and Zod validation integration all checked out.&lt;/p&gt;

&lt;p&gt;The thing that impressed me most was type inference. Data from &lt;code&gt;c.req.valid('json')&lt;/code&gt; is immediately typed by the Zod schema. Data stored with &lt;code&gt;c.set('userId', ...)&lt;/code&gt; comes back as &lt;code&gt;string&lt;/code&gt; from &lt;code&gt;c.get('userId')&lt;/code&gt;. TypeScript doesn't lose track of types as they flow through the middleware chain.&lt;/p&gt;

&lt;p&gt;I won't claim there's no reason to keep using Express. But if you're starting a new project with TypeScript and Bun and have edge deployment in mind, Hono is worth using right now.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Test Environment&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Bun: 1.3.14&lt;/li&gt;
&lt;li&gt;hono: 4.12.23&lt;/li&gt;
&lt;li&gt;@hono/zod-validator: 0.8.0&lt;/li&gt;
&lt;li&gt;zod: 4.4.3&lt;/li&gt;
&lt;li&gt;typescript: 5.9.3&lt;/li&gt;
&lt;li&gt;macOS 15.x (Apple Silicon)&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>hono</category>
      <category>typescript</category>
      <category>restapi</category>
      <category>cloudflareworkers</category>
    </item>
    <item>
      <title>Openpyxl's Relevance for Freelance Data Cleaning and Automation in 2023: Addressing Concerns and Solutions</title>
      <dc:creator>Roman Dubrovin</dc:creator>
      <pubDate>Wed, 03 Jun 2026 06:39:44 +0000</pubDate>
      <link>https://dev.to/romdevin/openpyxls-relevance-for-freelance-data-cleaning-and-automation-in-2023-addressing-concerns-and-4glm</link>
      <guid>https://dev.to/romdevin/openpyxls-relevance-for-freelance-data-cleaning-and-automation-in-2023-addressing-concerns-and-4glm</guid>
      <description>&lt;h2&gt;
  
  
  Introduction: The Question of Relevance
&lt;/h2&gt;

&lt;p&gt;Imagine you’re a college student, fresh off mastering &lt;strong&gt;pandas&lt;/strong&gt;, and you’re eyeing the freelancing market for data cleaning and automation gigs. You’ve heard of &lt;strong&gt;openpyxl&lt;/strong&gt;, but as you dig deeper, you hit a wall: every resource seems to peg it as a relic for handling &lt;em&gt;2010 Excel sheets&lt;/em&gt;. That’s it. No modern use cases, no integration with cutting-edge tools, just a dusty library stuck in the past. So, you pause. Is openpyxl still relevant in 2023, or is it a dead end for someone trying to build a competitive freelancing portfolio?&lt;/p&gt;

&lt;p&gt;This dilemma isn’t just about openpyxl—it’s about the &lt;em&gt;mechanism of perception&lt;/em&gt; in tech. When a tool is associated with outdated formats, its capabilities are often &lt;strong&gt;misinterpreted or overlooked&lt;/strong&gt;. Openpyxl’s documentation and community discourse rarely highlight its modern applications, leaving newcomers like you to assume it’s obsolete. But here’s the catch: openpyxl isn’t just a 2010 Excel handler. It’s a &lt;em&gt;low-level Excel manipulator&lt;/em&gt; that, when paired with libraries like pandas and numpy, can handle complex tasks that these libraries alone can’t. The problem isn’t openpyxl’s functionality—it’s the &lt;em&gt;information gap&lt;/em&gt; between its perceived and actual utility.&lt;/p&gt;

&lt;p&gt;The stakes are clear: if you dismiss openpyxl as outdated, you risk missing out on a tool that could &lt;strong&gt;complement your pandas and numpy skills&lt;/strong&gt;, making your freelancing services more efficient and versatile. But if you invest time in it without understanding its modern applications, you might waste effort on a tool that doesn’t align with current demands. The question isn’t whether openpyxl is relevant—it’s whether you’re looking at it through the right lens.&lt;/p&gt;

&lt;p&gt;In this investigation, we’ll dissect openpyxl’s role in 2023 freelancing, addressing its perceived limitations and uncovering its hidden strengths. By the end, you’ll have a clear rule for deciding whether to include it in your toolkit: &lt;strong&gt;If your freelancing gigs involve Excel-specific tasks that pandas can’t handle natively (e.g., formatting, metadata manipulation, or legacy file compatibility), use openpyxl alongside pandas.&lt;/strong&gt; Otherwise, stick to pandas alone. Let’s dive in.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding Openpyxl: Features and Limitations
&lt;/h2&gt;

&lt;p&gt;Let’s cut through the noise: &lt;strong&gt;openpyxl is not just a relic for 2010 Excel sheets.&lt;/strong&gt; The misperception comes from its own documentation, which describes it as a library for reading and writing “Excel 2010 xlsx/xlsm/xltx/xltm files.” That phrase names the Office Open XML file format, which current Excel versions still use by default, not a cutoff year. Openpyxl is a &lt;em&gt;low-level Excel manipulator&lt;/em&gt;: it works directly with the structural elements of a workbook (cells, worksheets, document properties). This distinguishes it from higher-level libraries like pandas, which prioritize DataFrames and analysis over Excel-specific tasks.&lt;/p&gt;

&lt;p&gt;Here’s the mechanism: When you open an Excel file with openpyxl, the library parses the file’s XML structure, allowing you to modify cells, adjust formatting, or manipulate metadata programmatically. Unlike pandas, which treats Excel files as data containers, openpyxl &lt;strong&gt;directly edits the file’s underlying architecture.&lt;/strong&gt; This is why it’s indispensable for tasks like preserving Excel-specific features (e.g., conditional formatting, pivot tables) that pandas would otherwise strip or ignore.&lt;/p&gt;

&lt;h2&gt;
  
  
  Core Functionalities
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Excel File Creation/Modification:&lt;/strong&gt; Openpyxl can create new Excel files or modify existing ones in the .xlsx, .xlsm, .xltx, and .xltm formats. The “2010” label refers to the file format, not an Excel version limit: workbooks saved by current Excel releases use the same format.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cell-Level Manipulation:&lt;/strong&gt; You can read, write, or format individual cells, including merging, splitting, or applying styles. This is where openpyxl outperforms pandas, which struggles with cell-specific operations.&lt;/li&gt;
&lt;li&gt;
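&lt;strong&gt;Metadata Handling:&lt;/strong&gt; Openpyxl lets you read and change sheet names and document properties, tasks pandas cannot handle natively. For macro-enabled workbooks, loading with &lt;code&gt;keep_vba=True&lt;/code&gt; carries the existing VBA project through a save, but openpyxl cannot edit or run the macros themselves.&lt;/li&gt;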
&lt;li&gt;
&lt;strong&gt;Legacy Compatibility:&lt;/strong&gt; Openpyxl reads .xlsx workbooks produced by older Excel releases, which helps on gigs involving long-lived spreadsheets. It does not read the binary .xls format used before Excel 2007, so those files need converting first (or a different reader such as xlrd).&lt;/li&gt;
&lt;/ul&gt;
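
&lt;p&gt;The cell-level operations listed above can be sketched in a few lines. This is a minimal illustration rather than a client-ready script: the sheet name, sample values, and colors are all made up, and the workbook is saved to an in-memory buffer instead of a real file path.&lt;/p&gt;

```python
from io import BytesIO

from openpyxl import Workbook, load_workbook
from openpyxl.styles import Font, PatternFill

wb = Workbook()
ws = wb.active
ws.title = "Report"

# Title row: one value spread across three merged cells.
ws["A1"] = "Quarterly Sales"
ws.merge_cells("A1:C1")
ws["A1"].font = Font(bold=True, size=14)

# Header row with bold text and a light fill.
ws.append(["Region", "Units", "Revenue"])
for cell in ws[2]:
    cell.font = Font(bold=True)
    cell.fill = PatternFill(fill_type="solid", start_color="DDEBF7")

# Data rows (made-up sample values).
ws.append(["North", 120, 4800.0])
ws.append(["South", 95, 3800.0])

# Column width and a document property.
ws.column_dimensions["A"].width = 18
wb.properties.title = "Quarterly Sales Report"

# Save to an in-memory buffer and read it back to confirm the result.
buffer = BytesIO()
wb.save(buffer)
buffer.seek(0)
reloaded = load_workbook(buffer)
sheet = reloaded["Report"]
merged = [r.coord for r in sheet.merged_cells.ranges]
print(sheet["A2"].font.bold, merged, reloaded.properties.title)
```

&lt;p&gt;For a real gig you would pass a file path to &lt;code&gt;wb.save()&lt;/code&gt; instead of the buffer. Merging, fonts, fills, and column widths are exactly the details a plain DataFrame export does not carry.&lt;/p&gt;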

&lt;h2&gt;
  
  
  Known Limitations
&lt;/h2&gt;

&lt;p&gt;Openpyxl isn’t perfect. Its &lt;strong&gt;low-level nature makes it verbose&lt;/strong&gt; for simple data extraction tasks. For example, reading a large dataset into a pandas DataFrame is more efficient than iterating through cells with openpyxl. Additionally, it lacks built-in support for advanced data analysis—a job better suited for pandas or numpy. The risk here is &lt;em&gt;overusing openpyxl&lt;/em&gt; for tasks it’s not optimized for, leading to slower execution times or bloated code.&lt;/p&gt;
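
&lt;p&gt;When you do need to pull values with openpyxl alone, its read-only mode reduces the overhead described above. The sketch below assumes a made-up two-column sheet generated in memory as a stand-in for a large client file; the column names and row count are illustrative only.&lt;/p&gt;

```python
from io import BytesIO

from openpyxl import Workbook, load_workbook

# Build a sample workbook in memory (stand-in for a large client file).
source = Workbook()
sheet = source.active
sheet.append(["id", "amount"])
for i in range(1, 1001):
    sheet.append([i, i * 1.5])
buffer = BytesIO()
source.save(buffer)
buffer.seek(0)

# read_only=True streams rows instead of building every cell object up front.
wb = load_workbook(buffer, read_only=True)
ws = wb.active

# values_only=True yields plain tuples of values, skipping the header row.
rows = ws.iter_rows(min_row=2, values_only=True)
total = sum(amount for _id, amount in rows)
wb.close()
print(total)
```

&lt;p&gt;Read-only mode trades editing for lower memory use, so it suits extraction but not formatting work. For anything beyond a simple aggregate, loading the sheet into a DataFrame is usually the shorter path.&lt;/p&gt;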

&lt;h2&gt;
  
  
  Relevance Mechanism: When to Use Openpyxl
&lt;/h2&gt;

&lt;p&gt;Openpyxl’s relevance hinges on the &lt;strong&gt;specific task requirements.&lt;/strong&gt; Here’s the decision rule:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;If X (task requires Excel-specific functionalities like formatting, metadata manipulation, or legacy compatibility) -&amp;gt; Use Y (openpyxl alongside pandas/numpy)&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;If X (task is purely data analysis or manipulation without Excel-specific needs) -&amp;gt; Use Y (pandas/numpy alone)&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For instance, if a freelancing gig involves cleaning a dataset &lt;em&gt;and&lt;/em&gt; preserving Excel formatting, openpyxl bridges the gap pandas leaves. Without it, you’d either lose formatting or manually recreate it—a time sink.&lt;/p&gt;

&lt;h2&gt;
  
  
  Practical Insight: Avoiding Common Errors
&lt;/h2&gt;

&lt;p&gt;A typical mistake is &lt;strong&gt;dismissing openpyxl as redundant&lt;/strong&gt; because pandas can read/write Excel files. This overlooks the library’s unique capabilities. Another error is &lt;strong&gt;over-relying on openpyxl&lt;/strong&gt; for data analysis, where pandas is more efficient. The optimal approach is &lt;em&gt;integration&lt;/em&gt;: use pandas for data manipulation and openpyxl for Excel-specific tasks.&lt;/p&gt;
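
&lt;p&gt;The integration described above might look like the following sketch: pandas cleans the data and writes the values, then openpyxl reopens the result for the Excel-specific touches. The sample data, sheet name, and styling choices are assumptions for illustration, and both pandas and openpyxl need to be installed.&lt;/p&gt;

```python
from io import BytesIO

import pandas as pd
from openpyxl import load_workbook
from openpyxl.styles import Font

# Step 1: clean with pandas (made-up sample data).
raw = pd.DataFrame({"name": [" Ada ", "Bob", None], "score": ["90", "85", "70"]})
clean = raw.dropna(subset=["name"]).copy()
clean["name"] = clean["name"].str.strip()
clean["score"] = clean["score"].astype(int)

# Step 2: let pandas write the values to an .xlsx stream.
buffer = BytesIO()
clean.to_excel(buffer, sheet_name="Clean", index=False, engine="openpyxl")
buffer.seek(0)

# Step 3: reopen with openpyxl for the Excel-specific touches.
wb = load_workbook(buffer)
ws = wb["Clean"]
for cell in ws[1]:
    cell.font = Font(bold=True, color="1F4E78")
ws.freeze_panes = "A2"  # keep the header visible while scrolling
ws.column_dimensions["A"].width = 20

# Save the styled workbook (a file path would replace this buffer in practice).
styled = BytesIO()
wb.save(styled)
```

&lt;p&gt;Keeping the two steps separate means the cleaning logic stays testable as ordinary DataFrame code, while the presentation layer stays in one small openpyxl block.&lt;/p&gt;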

&lt;p&gt;For college students entering freelancing, understanding this synergy is critical. Openpyxl isn’t outdated—it’s a &lt;strong&gt;specialized tool&lt;/strong&gt; that complements modern libraries. Dismissing it risks leaving money on the table for gigs requiring Excel expertise.&lt;/p&gt;

&lt;h2&gt;
  
  
  Industry Trends and Client Expectations: Is Openpyxl Still in the Game?
&lt;/h2&gt;

&lt;p&gt;Let’s cut to the chase: &lt;strong&gt;openpyxl isn’t dead&lt;/strong&gt;, but its relevance hinges on how you wield it. The misconception that it’s a relic for 2010 Excel sheets stems from the “Excel 2010” label in its documentation, which names the Office Open XML file format rather than an Excel version. &lt;strong&gt;The .xlsx, .xlsm, and .xltx files written by current Excel releases use that same format&lt;/strong&gt;, and openpyxl reads and writes them by working on the XML parts inside each file. The problem? Documentation and community discourse &lt;em&gt;rarely spell this out&lt;/em&gt;, leaving newcomers like you in the dark.&lt;/p&gt;

&lt;p&gt;Here’s the causal chain: &lt;strong&gt;Clients demand tools that handle modern Excel features&lt;/strong&gt; (e.g., dynamic arrays, enhanced formatting). Openpyxl’s &lt;em&gt;direct file editing capability&lt;/em&gt; preserves these features by modifying the file architecture at the XML level, unlike pandas, which strips them during data extraction. For instance, if a client needs &lt;strong&gt;conditional formatting or pivot tables retained&lt;/strong&gt;, openpyxl’s &lt;em&gt;cell-level manipulation&lt;/em&gt; (merging, splitting, styling) ensures these aren’t lost—something pandas can’t do natively.&lt;/p&gt;
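
&lt;p&gt;To make the preservation claim concrete, here is a small round-trip sketch: a workbook gets one conditional-formatting rule, is saved, reopened with openpyxl, edited, and the rule is still attached. The data and the rule are invented for the example, and an in-memory buffer stands in for the client’s file.&lt;/p&gt;

```python
from io import BytesIO

from openpyxl import Workbook, load_workbook
from openpyxl.formatting.rule import CellIsRule
from openpyxl.styles import PatternFill

# Build a workbook with one conditional-formatting rule (sample data is made up).
wb = Workbook()
ws = wb.active
ws.append(["item", "margin"])
for row in [["A", 12], ["B", -3], ["C", 7]]:
    ws.append(row)

# Highlight negative margins in red.
red_fill = PatternFill(start_color="FFC7CE", end_color="FFC7CE", fill_type="solid")
ws.conditional_formatting.add(
    "B2:B4",
    CellIsRule(operator="lessThan", formula=["0"], fill=red_fill),
)

# Simulate the client file with an in-memory buffer.
buffer = BytesIO()
wb.save(buffer)
buffer.seek(0)

# Reopen, change a value, and count the rules that survived the round trip.
wb2 = load_workbook(buffer)
ws2 = wb2.active
ws2["B2"] = 15
rule_count = sum(len(cf.rules) for cf in ws2.conditional_formatting)
print(rule_count)
```

&lt;p&gt;Reading the same sheet into a DataFrame and writing it back with &lt;code&gt;to_excel&lt;/code&gt; would produce a fresh workbook containing only the values, which is why the edit is done through openpyxl here.&lt;/p&gt;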

&lt;p&gt;But there’s a risk: &lt;strong&gt;Overusing openpyxl for non-Excel-specific tasks&lt;/strong&gt; (e.g., large dataset analysis) leads to &lt;em&gt;verbose, slow code&lt;/em&gt;. The mechanism? Walking a worksheet cell by cell runs a Python-level loop over every value, whereas pandas applies vectorized operations to whole columns once the data is in a DataFrame. Note that pandas reads .xlsx files through openpyxl by default, so the file-parsing cost is similar either way; the savings come from the analysis step. Thus, the rule is: &lt;strong&gt;If the task requires Excel-specific functionalities (formatting, metadata, feature preservation), use openpyxl. Otherwise, pandas alone suffices.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Edge Cases and Practical Insights
&lt;/h2&gt;

&lt;p&gt;Consider a gig involving &lt;strong&gt;macro-enabled workbooks (.xlsm)&lt;/strong&gt;. Loading the file with &lt;code&gt;keep_vba=True&lt;/code&gt; lets openpyxl carry the embedded VBA project through an edit-and-save cycle, something a pandas round trip cannot do. Openpyxl cannot edit or run the macros, though. If the client needs &lt;strong&gt;pure data analysis without Excel-specific features&lt;/strong&gt;, sticking to pandas keeps the code shorter; pandas still uses openpyxl under the hood to read .xlsx files, so the gain is simplicity rather than parsing speed.&lt;/p&gt;

&lt;p&gt;Another edge case: &lt;strong&gt;Freelancers often juggle multiple file formats.&lt;/strong&gt; Openpyxl’s &lt;em&gt;legacy compatibility&lt;/em&gt; gives you an edge for clients stuck on older systems, while its &lt;em&gt;modern format support&lt;/em&gt; ensures you’re not left behind. The key is &lt;strong&gt;integration&lt;/strong&gt;: Use pandas for data manipulation and openpyxl for Excel-specific tasks. This &lt;em&gt;hybrid approach&lt;/em&gt; optimizes efficiency and preserves features, making your services more competitive.&lt;/p&gt;

&lt;h2&gt;
  
  
  Decision Dominance: When to Use Openpyxl
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Use openpyxl if:&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;The task requires &lt;em&gt;Excel-specific functionalities&lt;/em&gt; (e.g., formatting, metadata, legacy compatibility).&lt;/li&gt;
&lt;li&gt;The client demands &lt;em&gt;preservation of Excel features&lt;/em&gt; (e.g., conditional formatting, pivot tables).&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Avoid openpyxl if:&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;The task is &lt;em&gt;pure data analysis&lt;/em&gt; without Excel-specific needs.&lt;/li&gt;
&lt;li&gt;You’re dealing with &lt;em&gt;large datasets&lt;/em&gt; where pandas’ efficiency outweighs openpyxl’s capabilities.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;Typical choice errors? &lt;strong&gt;Dismissing openpyxl as outdated&lt;/strong&gt; or &lt;strong&gt;over-relying on it for data analysis.&lt;/strong&gt; The former overlooks its unique Excel-specific capabilities, while the latter leads to &lt;em&gt;inefficient code execution&lt;/em&gt; due to its resource-intensive XML parsing. The optimal solution? &lt;strong&gt;Combine pandas and openpyxl&lt;/strong&gt; based on task requirements. This hybrid approach ensures you’re neither underutilizing openpyxl nor misusing it, making your freelancing services both efficient and competitive.&lt;/p&gt;

&lt;h2&gt;
  
  
  Comparative Analysis: Openpyxl vs. Alternatives
&lt;/h2&gt;

&lt;p&gt;As a college student stepping into freelancing, the question of whether &lt;strong&gt;openpyxl&lt;/strong&gt; is still relevant is valid, especially given its association with older Excel formats. However, dismissing it as outdated overlooks its unique capabilities and complementary role alongside modern libraries like &lt;strong&gt;pandas&lt;/strong&gt; and &lt;strong&gt;numpy&lt;/strong&gt;. Below, we dissect openpyxl’s strengths, weaknesses, and use cases in comparison to alternatives, backed by technical mechanisms and practical insights.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Core Mechanisms and Technical Insights
&lt;/h3&gt;

&lt;p&gt;Openpyxl operates via &lt;strong&gt;low-level XML parsing&lt;/strong&gt;, directly manipulating Excel file structures (cells, worksheets, metadata). This mechanism enables:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Excel-specific feature preservation&lt;/strong&gt;: Unlike pandas, which strips conditional formatting, pivot tables, and macros during extraction, openpyxl preserves these features by editing the file architecture directly.&lt;/li&gt;
&lt;li&gt;
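&lt;strong&gt;Current-format support&lt;/strong&gt;: Reads and writes the .xlsx, .xlsm, and .xltx formats used by current Excel releases, and can carry an existing VBA project through a save when a workbook is loaded with &lt;code&gt;keep_vba=True&lt;/code&gt;. The binary .xls format is not supported.&lt;/li&gt;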
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Mechanism:&lt;/em&gt; An .xlsx file is a ZIP archive of XML parts. Openpyxl loads those parts into Python objects, which is what lets it keep styles and other features on save. Building an object per cell is &lt;strong&gt;memory- and CPU-intensive&lt;/strong&gt; on large sheets, which slows work that only needs the raw values.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Comparative Strengths and Weaknesses
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Openpyxl vs. Pandas
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Strengths of openpyxl&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Excel-specific tasks&lt;/strong&gt;: Handles formatting, metadata manipulation, and legacy compatibility—tasks pandas cannot perform natively.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Feature preservation&lt;/strong&gt;: Ensures Excel features remain intact, critical for client deliverables.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Weaknesses of openpyxl&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Inefficiency for data analysis&lt;/strong&gt;: Lacks built-in analysis capabilities, making it slower than pandas for large datasets.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Verbose syntax&lt;/strong&gt;: Requires more code for simple tasks compared to pandas’ concise DataFrame operations.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Mechanism:&lt;/em&gt; Pandas reads only the cell values into a DataFrame (using openpyxl as its default .xlsx engine) and then works on whole columns with vectorized operations. Openpyxl used directly keeps the full workbook model, including styles and layout, which makes it slower for bulk analysis but the right tool for Excel-specific tasks.&lt;/p&gt;

&lt;h4&gt;
  
  
  Openpyxl vs. Other Libraries (e.g., xlwings, pyexcel)
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;xlwings&lt;/strong&gt;: Excels in integrating Excel with Python for automation but requires Excel to be installed. Openpyxl operates independently, making it more portable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;pyexcel&lt;/strong&gt;: Simplifies file format conversions but lacks openpyxl’s granular control over Excel features.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Mechanism:&lt;/em&gt; Openpyxl’s direct XML manipulation provides finer control over Excel files, whereas alternatives prioritize ease of use or integration with external tools.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Optimal Usage Guidelines and Decision Rules
&lt;/h3&gt;

&lt;p&gt;To maximize efficiency and competitiveness in freelancing, follow these rules:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;If task requires Excel-specific functionalities (formatting, metadata, legacy compatibility)&lt;/strong&gt; → &lt;strong&gt;Use openpyxl&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;If task is purely data analysis without Excel-specific needs&lt;/strong&gt; → &lt;strong&gt;Use pandas/numpy&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;For hybrid tasks (e.g., data cleaning + Excel formatting)&lt;/strong&gt; → &lt;strong&gt;Combine pandas and openpyxl&lt;/strong&gt;. Use pandas for data manipulation and openpyxl for Excel-specific tasks.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Mechanism:&lt;/em&gt; Combining libraries leverages their strengths: pandas’ efficiency in data handling and openpyxl’s precision in Excel manipulation. This hybrid approach minimizes performance bottlenecks and ensures feature preservation.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Edge Cases and Risk Mitigation
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Edge Cases Where Openpyxl Excels
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Long-lived workbooks&lt;/strong&gt;: Clients often maintain .xlsx files that have been edited for years. Openpyxl can update them in place, though binary .xls files must be converted first.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Feature-rich deliverables&lt;/strong&gt;: Clients requiring conditional formatting, pivot tables, or macros benefit from openpyxl’s preservation capabilities.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Common Errors and Their Mechanisms
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Dismissing openpyxl as outdated&lt;/strong&gt;: Overlooks its unique Excel capabilities, leading to suboptimal solutions for Excel-specific tasks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Over-relying on openpyxl&lt;/strong&gt;: Using it for data analysis instead of pandas results in inefficient code execution due to its resource-intensive XML parsing.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Mechanism:&lt;/em&gt; Misuse of openpyxl for non-Excel-specific tasks slows execution, as its XML parsing is not optimized for large datasets or analysis.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Professional Judgment and Conclusion
&lt;/h3&gt;

&lt;p&gt;Openpyxl remains a &lt;strong&gt;relevant and valuable tool&lt;/strong&gt; for freelancers, particularly when integrated with pandas and numpy. Its ability to handle Excel-specific tasks and preserve features complements the data manipulation strengths of modern libraries. However, its effectiveness depends on task requirements:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Use openpyxl if&lt;/strong&gt;: The task involves Excel-specific functionalities or requires feature preservation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Avoid openpyxl if&lt;/strong&gt;: The task is purely data analysis or involves large datasets without Excel-specific needs.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By understanding openpyxl’s mechanisms and limitations, college students and new freelancers can make informed decisions, ensuring their services are both efficient and competitive in the growing data cleaning and automation market.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion: Is Openpyxl Still Relevant?
&lt;/h2&gt;

&lt;p&gt;After a deep dive into openpyxl's capabilities and its role in modern data cleaning and automation, the answer is clear: &lt;strong&gt;Yes, openpyxl remains highly relevant for freelancers in 2023&lt;/strong&gt;, especially when paired with libraries like pandas and numpy. However, its relevance hinges on understanding its specific strengths and limitations, as well as the nature of the tasks at hand.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Findings
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Misperception Debunked:&lt;/strong&gt; Openpyxl is not just a tool for 2010 Excel sheets. The “Excel 2010” label names the .xlsx file format that current Excel releases still use, and the library offers low-level manipulation of those files, including &lt;em&gt;cell-level formatting and metadata handling&lt;/em&gt;. Because it edits the workbook’s own structure, it keeps features like &lt;em&gt;conditional formatting and pivot tables&lt;/em&gt; that a pandas read-and-rewrite would drop.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Complementary Role:&lt;/strong&gt; Openpyxl excels at tasks pandas cannot handle natively, such as &lt;em&gt;Excel-specific formatting and metadata manipulation&lt;/em&gt;. For example, while pandas efficiently extracts and analyzes data, it lacks the ability to preserve Excel features like &lt;em&gt;macros or conditional formatting&lt;/em&gt;. Openpyxl bridges this gap, making it a valuable complement rather than a replacement.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance Trade-offs:&lt;/strong&gt; Working cell by cell in openpyxl is &lt;em&gt;resource-intensive&lt;/em&gt; for large datasets, because the default mode loads the whole workbook into Python objects, which is overkill for simple data extraction. Pandas uses openpyxl to read .xlsx files but then relies on optimized DataFrame operations, so it outperforms hand-written openpyxl loops in pure data analysis tasks.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Actionable Advice for Freelancers
&lt;/h3&gt;

&lt;p&gt;To leverage openpyxl effectively, follow these guidelines:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Use openpyxl if:&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;The task requires &lt;em&gt;Excel-specific functionalities&lt;/em&gt; (e.g., formatting, metadata, legacy compatibility).&lt;/li&gt;
&lt;li&gt;You need to &lt;em&gt;preserve Excel features&lt;/em&gt; like conditional formatting or pivot tables.&lt;/li&gt;
&lt;li&gt;You’re maintaining &lt;em&gt;existing .xlsx or .xlsm workbooks&lt;/em&gt; that must keep their layout (binary .xls files need converting first).&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Avoid openpyxl if:&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;The task is purely &lt;em&gt;data analysis&lt;/em&gt; without Excel-specific needs—use pandas instead.&lt;/li&gt;
&lt;li&gt;You’re handling &lt;em&gt;large datasets&lt;/em&gt; where performance is critical.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Hybrid Approach:&lt;/strong&gt; Combine pandas for data manipulation and openpyxl for Excel-specific tasks. For example, use pandas to clean and analyze data, then openpyxl to format the output and preserve Excel features. This minimizes performance bottlenecks and maximizes efficiency.&lt;/li&gt;

&lt;/ul&gt;
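&lt;p&gt;A minimal sketch of the hybrid approach, assuming pandas and openpyxl are both installed. The sample data, the &lt;code&gt;build_report&lt;/code&gt; helper, and the styling choices are illustrative assumptions rather than a prescribed workflow: pandas handles cleaning and aggregation, then openpyxl reopens the file to apply formatting.&lt;/p&gt;

```python
import pandas as pd
from openpyxl import load_workbook
from openpyxl.styles import Font, PatternFill


def build_report(path):
    """Clean and aggregate with pandas, then style the sheet with openpyxl."""
    # Hypothetical raw data; a real gig would read this from the client's file.
    raw = pd.DataFrame(
        {
            "region": ["north", "south", "north", None],
            "sales": [120.0, 80.0, 30.0, 50.0],
        }
    )

    # pandas: drop rows without a region, then total sales per region.
    summary = (
        raw.dropna(subset=["region"])
        .groupby("region", as_index=False)["sales"]
        .sum()
    )
    summary.to_excel(path, index=False, sheet_name="Summary")

    # openpyxl: reopen the workbook and apply Excel-specific formatting.
    workbook = load_workbook(path)
    sheet = workbook["Summary"]
    for cell in sheet[1]:  # the first row holds the column headers
        cell.font = Font(bold=True)
        cell.fill = PatternFill(fill_type="solid", start_color="DDDDDD")
    sheet.freeze_panes = "A2"  # keep the header visible while scrolling
    workbook.save(path)
    return summary
```

&lt;p&gt;Calling &lt;code&gt;build_report("report.xlsx")&lt;/code&gt; leaves the heavy lifting to pandas and touches openpyxl only for the final styling pass, which keeps the slower cell-by-cell work as small as possible.&lt;/p&gt;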

&lt;h3&gt;
  
  
  Common Errors to Avoid
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Dismissing openpyxl:&lt;/strong&gt; Overlooking its unique Excel capabilities can limit your ability to deliver feature-rich, client-ready deliverables. Mechanism: Clients often require formatted reports or legacy compatibility, which openpyxl handles better than pandas.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Over-relying on openpyxl:&lt;/strong&gt; Using it for data analysis instead of pandas leads to &lt;em&gt;inefficient code execution&lt;/em&gt; due to its resource-intensive XML parsing. Mechanism: XML parsing involves deserializing the entire file structure, which is unnecessary for simple data extraction tasks.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Decision Rule
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;If the task requires Excel-specific functionalities or feature preservation → use openpyxl.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
&lt;strong&gt;If the task is purely data analysis or involves large datasets → use pandas/numpy.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
&lt;strong&gt;For hybrid tasks → combine pandas (data manipulation) and openpyxl (Excel-specific tasks).&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Final Verdict
&lt;/h3&gt;

&lt;p&gt;Openpyxl is not outdated—it’s a specialized tool that, when used correctly, enhances your freelancing services. By integrating it with pandas and numpy, you can offer &lt;em&gt;competitive, efficient, and feature-rich solutions&lt;/em&gt; for data cleaning and automation gigs. As a college student entering the freelancing market, mastering this hybrid approach will set you apart and ensure your services meet current industry demands.&lt;/p&gt;

</description>
      <category>openpyxl</category>
      <category>pandas</category>
      <category>automation</category>
      <category>excel</category>
    </item>
    <item>
      <title>AI Coding Agents in 2026: From Pair Programming to Autonomous Teams</title>
      <dc:creator>A3E Ecosystem</dc:creator>
      <pubDate>Wed, 03 Jun 2026 06:39:18 +0000</pubDate>
      <link>https://dev.to/a3e_ecosystem/ai-coding-agents-in-2026-from-pair-programming-to-autonomous-teams-4f7o</link>
      <guid>https://dev.to/a3e_ecosystem/ai-coding-agents-in-2026-from-pair-programming-to-autonomous-teams-4f7o</guid>
      <description>&lt;h1&gt;
  
  
  AI Coding Agents in 2026: From Pair Programming to Autonomous Teams
&lt;/h1&gt;





&lt;h2&gt;
  
  
  1. The Three Categories That Actually Matter
&lt;/h2&gt;

&lt;p&gt;The 2024‑2025 hype cycle ranked every AI coding tool on a single “best‑of” list. Professional developers now average &lt;strong&gt;2.4 tools per workflow&lt;/strong&gt; (Stack Overflow Survey 2025), so the real decision is architectural:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Goal&lt;/th&gt;
&lt;th&gt;Typical Agent Type&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Line‑level editing&lt;/td&gt;
&lt;td&gt;Speed, low latency&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Editor assistants&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Repo‑level planning&lt;/td&gt;
&lt;td&gt;Context depth, multi‑file changes&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Autonomous agents&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Enterprise governance&lt;/td&gt;
&lt;td&gt;Isolation, audit, CI/CD integration&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Platform agents&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Choosing a “one best tool” ignores the trade‑off between &lt;strong&gt;context window size&lt;/strong&gt; (how many tokens the model can see) and &lt;strong&gt;execution speed&lt;/strong&gt; (how fast the tool returns a suggestion).  A narrow‑window editor assistant excels at instant autocomplete, while a wide‑window autonomous agent can rewrite an entire microservice in a single run.  The three‑tier framework aligns the tool’s strengths with the architectural layer where they matter most.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Tier 1: Editor Assistants — Speed at the Line Level
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Market Position&lt;/th&gt;
&lt;th&gt;Key Feature (2026)&lt;/th&gt;
&lt;th&gt;Pricing (per developer)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cursor&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$500 M+ ARR, fastest growth in Q1 2026&lt;/td&gt;
&lt;td&gt;Parallel agents update git worktrees; 2‑second latency on 8‑core laptops&lt;/td&gt;
&lt;td&gt;$15 /mo (individual) – $120 /mo (team)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;GitHub Copilot&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;4.7 M paid subscriptions, 75 % YoY growth&lt;/td&gt;
&lt;td&gt;Agent Mode with multi‑agent workflows; deep VS Code integration&lt;/td&gt;
&lt;td&gt;$10 /mo (individual) – $100 /mo (enterprise)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Windsurf&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1.2 M active users, strong UI polish&lt;/td&gt;
&lt;td&gt;Real‑time code‑style enforcement; limited to 4‑file context&lt;/td&gt;
&lt;td&gt;Free tier up to 5 k lines, $30 /mo premium&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Tabnine&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Enterprise‑only after 2026 pivot&lt;/td&gt;
&lt;td&gt;Air‑gapped deployment; NVIDIA Nemotron 4‑bit models for on‑prem inference&lt;/td&gt;
&lt;td&gt;$200 /mo per seat (minimum 10 seats)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;When to choose each&lt;/strong&gt;  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cursor&lt;/strong&gt; – prioritize raw typing speed and git‑aware suggestions. Ideal for startups that need rapid iteration without heavy IDE lock‑in.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Copilot&lt;/strong&gt; – best for teams already on GitHub, especially when you want the same model to power pull‑request suggestions and code reviews.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Windsurf&lt;/strong&gt; – fits developers who value UI polish and strict style enforcement over raw speed.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tabnine&lt;/strong&gt; – the only option for regulated industries that require complete data isolation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All four tools expose an OpenAI‑compatible completion endpoint, making it easy to swap the backend model without breaking the editor integration.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. Tier 2: Autonomous Agents — Depth at the Repo Level
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Agent&lt;/th&gt;
&lt;th&gt;SWE‑bench Score (2026)&lt;/th&gt;
&lt;th&gt;Context Window&lt;/th&gt;
&lt;th&gt;Execution Model&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Claude Code&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;80.8 % (Opus 4.6)&lt;/td&gt;
&lt;td&gt;1 M tokens&lt;/td&gt;
&lt;td&gt;Terminal‑native, can run &lt;code&gt;git checkout&lt;/code&gt; and &lt;code&gt;npm test&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Codex CLI&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;78.3 % (GPT‑4‑Turbo)&lt;/td&gt;
&lt;td&gt;800 k tokens&lt;/td&gt;
&lt;td&gt;“Go do this” prompt language; auto‑generates scripts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Aider&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;76.5 % (mixed model)&lt;/td&gt;
&lt;td&gt;600 k tokens&lt;/td&gt;
&lt;td&gt;CLI‑first, supports multi‑model backends&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;OpenCode&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;72.0 % (Claude‑compatible)&lt;/td&gt;
&lt;td&gt;900 k tokens&lt;/td&gt;
&lt;td&gt;Provider‑agnostic; 90 % of Claude performance at 10 % cost&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cline&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;71.4 % (GPT‑4)&lt;/td&gt;
&lt;td&gt;500 k tokens&lt;/td&gt;
&lt;td&gt;VS Code sidecar, transparent tool control&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Real‑world scenarios&lt;/strong&gt;  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fixing a production bug&lt;/strong&gt; – Claude Code can pull the failing commit, run the test suite, and suggest a patch in under two minutes.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Onboarding to a new codebase&lt;/strong&gt; – Codex CLI can generate a high‑level architecture diagram and scaffold unit tests for every module in a single run.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Writing comprehensive tests&lt;/strong&gt; – Aider’s multi‑model support lets you pair a cheap 8‑bit model for boilerplate with a premium 32‑bit model for edge‑case logic, reducing API spend by 35 %.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Autonomous agents excel when the task exceeds a few lines and requires &lt;strong&gt;repo‑wide context&lt;/strong&gt;. Their ability to execute shell commands means they can close the loop between suggestion and verification, something editor assistants cannot do.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. Tier 3: Platform Agents — Governance at the Enterprise Level
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Platform&lt;/th&gt;
&lt;th&gt;Core Capability&lt;/th&gt;
&lt;th&gt;Isolation Model&lt;/th&gt;
&lt;th&gt;Pricing&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Codegen (ClickUp)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Orchestrates multiple agents, injects business metadata&lt;/td&gt;
&lt;td&gt;Containerized sandboxes per ticket&lt;/td&gt;
&lt;td&gt;$2 k/mo for 50 agents, $0.05 per execution&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Devin&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Ticket‑driven autonomous dev environment&lt;/td&gt;
&lt;td&gt;VM isolation with encrypted state&lt;/td&gt;
&lt;td&gt;$1.5 k/mo for 30 agents&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;RooCode&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Reliability‑first change engine, rollback on test failure&lt;/td&gt;
&lt;td&gt;Kubernetes pods with role‑based access&lt;/td&gt;
&lt;td&gt;$2.2 k/mo for 40 agents&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Augment&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;End‑to‑end CI/CD integration, auto‑scaling&lt;/td&gt;
&lt;td&gt;Multi‑tenant SaaS, audit logs&lt;/td&gt;
&lt;td&gt;$2.5 k/mo for 45 agents&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;JetBrains Junie&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Deep integration with IntelliJ suite&lt;/td&gt;
&lt;td&gt;Sandboxed JVM processes&lt;/td&gt;
&lt;td&gt;$1.8 k/mo for 35 agents&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Enterprise criteria&lt;/strong&gt;  &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Security isolation&lt;/strong&gt; – agents must run in environments that prevent data leakage.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;State persistence&lt;/strong&gt; – long‑running refactors need a persistent workspace.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost predictability&lt;/strong&gt; – flat‑rate pricing avoids surprise API bills.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audit trails&lt;/strong&gt; – every change must be logged for compliance.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Platform agents are the glue that brings autonomous agents into a regulated CI/CD pipeline. They also provide a single point of governance for the editor assistants used by developers on the ground.&lt;/p&gt;




&lt;h2&gt;
  
  
  5. Building Your Stack — How to Combine Tiers Without Fragmentation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Common pattern
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Editor assistant&lt;/strong&gt; – daily driver for line‑level edits.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Autonomous agent&lt;/strong&gt; – invoked for complex refactors, test generation, or bug triage.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Platform agent (optional)&lt;/strong&gt; – sits in CI/CD to enforce policy and capture audit logs.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Integration layer: Model Context Protocol (MCP)
&lt;/h3&gt;

&lt;p&gt;MCP standardizes how tools exchange context, token limits, and execution results. Two popular implementations in 2026 are &lt;strong&gt;Zapier MCP&lt;/strong&gt; (hosted) and &lt;strong&gt;custom self‑hosted MCP servers&lt;/strong&gt; (Docker image &lt;code&gt;mcp/server:2.1&lt;/code&gt;). By routing all requests through MCP, you avoid “prompt fatigue” – the user stays in the editor while the backend swaps from Cursor to Claude Code and finally to Codegen without manual context copying.&lt;/p&gt;

&lt;h3&gt;
  
  
  Case studies
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;th&gt;Editor&lt;/th&gt;
&lt;th&gt;Autonomous&lt;/th&gt;
&lt;th&gt;Platform&lt;/th&gt;
&lt;th&gt;Outcome&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;React front‑end dev&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Cursor (VS Code)&lt;/td&gt;
&lt;td&gt;Claude Code (repo‑wide refactor)&lt;/td&gt;
&lt;td&gt;Codegen (ticket‑based deployment)&lt;/td&gt;
&lt;td&gt;Reduced feature turnaround from 5 days to 2 days; 30 % fewer PR comments.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Data scientist&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Copilot (Jupyter)&lt;/td&gt;
&lt;td&gt;OpenCode on DeepSeek (cost‑optimized)&lt;/td&gt;
&lt;td&gt;Custom MCP server (on‑prem)&lt;/td&gt;
&lt;td&gt;Generated reproducible pipelines for 12 models in 3 hours; cut cloud spend by $4 k/month.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Enterprise team&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Copilot Business (GitHub Enterprise)&lt;/td&gt;
&lt;td&gt;RooCode (large‑scale migration)&lt;/td&gt;
&lt;td&gt;Tabnine air‑gapped + Codegen&lt;/td&gt;
&lt;td&gt;Completed monolith‑to‑microservice split in 6 weeks while maintaining full audit trail.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Avoiding fragmentation
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Keep &lt;strong&gt;one MCP endpoint&lt;/strong&gt; per project.
&lt;/li&gt;
&lt;li&gt;Define &lt;strong&gt;context handoff rules&lt;/strong&gt;: if token usage exceeds 800 k, automatically route to the autonomous agent.
&lt;/li&gt;
&lt;li&gt;Use &lt;strong&gt;feature flags&lt;/strong&gt; to enable or disable platform agents per branch, preventing accidental execution in dev environments.&lt;/li&gt;
&lt;/ul&gt;
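&lt;p&gt;The handoff and feature-flag rules above could be sketched roughly as follows. This is an illustration under stated assumptions, not part of any MCP implementation: the &lt;code&gt;Request&lt;/code&gt; type, the stage labels, and the branch names are hypothetical, and only the 800 k threshold comes from the rule itself.&lt;/p&gt;

```python
from dataclasses import dataclass

# Threshold taken from the handoff rule: above this, use the autonomous agent.
EDITOR_TOKEN_LIMIT = 800_000


@dataclass(frozen=True)
class Request:
    """Hypothetical description of one unit of work sent through the stack."""

    estimated_tokens: int
    branch: str


def route(request, platform_enabled_branches=frozenset({"main"})):
    """Return the ordered list of tiers that should handle the request."""
    stages = []
    # Context handoff rule: large contexts go to the autonomous agent.
    if request.estimated_tokens > EDITOR_TOKEN_LIMIT:
        stages.append("autonomous_agent")
    else:
        stages.append("editor_assistant")
    # Feature flag: platform agents run only on explicitly enabled branches.
    if request.branch in platform_enabled_branches:
        stages.append("platform_agent")
    return stages
```

&lt;p&gt;Keeping the rule in one small function means the threshold and the enabled branches live in a single place per project, which matches the one-endpoint-per-project advice.&lt;/p&gt;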




&lt;h2&gt;
  
  
  6. What’s Coming in Late 2026
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Multi‑agent orchestration&lt;/strong&gt; – agents will delegate tasks across tiers automatically (e.g., an editor assistant detects a pattern and spawns an autonomous agent).
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agent‑to‑agent communication&lt;/strong&gt; – MCP will become the universal protocol, allowing Claude Code to hand off a patch to RooCode for compliance checks.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;2 M+ token windows&lt;/strong&gt; – models from DeepMind and Anthropic will support context windows exceeding two million tokens, making whole‑codebase analysis routine.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SWE‑bench saturation&lt;/strong&gt; – scores have plateaued above 80 %; differentiation will shift to reliability, UX, and cost.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Open‑source catch‑up&lt;/strong&gt; – OpenCode, Aider, and Cline now cover 90 % of paid‑tool functionality at 10 % of the price, eroding the moat of proprietary agents.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Key Takeaways
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Stop asking “which agent is best”; ask “which category do I need at each layer.”
&lt;/li&gt;
&lt;li&gt;Editor assistants remain the daily driver for 90 % of coding work.
&lt;/li&gt;
&lt;li&gt;Autonomous agents are the new CLI for repo‑wide operations.
&lt;/li&gt;
&lt;li&gt;Platform agents matter only when you need audit trails and isolation.
&lt;/li&gt;
&lt;li&gt;MCP is the glue; a well‑designed integration layer determines stack performance.
&lt;/li&gt;
&lt;li&gt;Open‑source agents are eating the bottom; combine them with cheap APIs for maximum ROI.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Ready to future‑proof your development workflow? Choose the right tier, connect them with MCP, and let the agents do the heavy lifting.  &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Start building your three‑tier AI coding stack today.&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>coding</category>
      <category>agents</category>
      <category>devtools</category>
    </item>
    <item>
      <title>Tool-Call Accuracy Is Lying to You: A Four-Layer Eval Stack for Agents</title>
      <dc:creator>Nikhil Pareek</dc:creator>
      <pubDate>Wed, 03 Jun 2026 06:37:49 +0000</pubDate>
      <link>https://dev.to/nikhil_pareek_13/tool-call-accuracy-is-lying-to-you-a-four-layer-eval-stack-for-agents-523p</link>
      <guid>https://dev.to/nikhil_pareek_13/tool-call-accuracy-is-lying-to-you-a-four-layer-eval-stack-for-agents-523p</guid>
      <description>&lt;p&gt;Here's a trace that reset how I think about evaluating tool-calling agents.&lt;/p&gt;

&lt;p&gt;An agent tries to book a flight. It calls &lt;code&gt;search_flights&lt;/code&gt; with &lt;code&gt;departure_date="next Friday"&lt;/code&gt;. The endpoint expected an ISO date, so it returns a &lt;code&gt;400&lt;/code&gt;. The agent retries the same string four times, then apologizes to the user and gives up.&lt;/p&gt;

&lt;p&gt;Now the part that actually bothered me. &lt;strong&gt;Tool selection was correct.&lt;/strong&gt; The model picked the right function out of a registry of 28. My tool-selection accuracy logged a clean &lt;code&gt;1.0&lt;/code&gt;. The aggregate task-completion metric logged a &lt;code&gt;0&lt;/code&gt;. And neither number told me which of three things broke:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the argument was wrong,&lt;/li&gt;
&lt;li&gt;the model never read the &lt;code&gt;400&lt;/code&gt; body, or&lt;/li&gt;
&lt;li&gt;the retry policy looped on the same input.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;My eval wasn't wrong. It was asking the wrong question.&lt;/p&gt;

&lt;h2&gt;
  
  
  What "tool-call accuracy" actually grades
&lt;/h2&gt;

&lt;p&gt;If the only thing you measure is &lt;em&gt;did the agent call the right tool&lt;/em&gt;, you're testing intent, not execution. Tool selection is necessary, not sufficient. It passes the moment the right function name shows up in the trace, completely blind to whether the arguments were garbage, whether the model read what came back, or whether it recovered from the &lt;code&gt;400&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;That's the gap. The metric checks that the agent &lt;em&gt;started&lt;/em&gt; the right way. Production needs to know whether it &lt;em&gt;finished&lt;/em&gt; the right way.&lt;/p&gt;

&lt;h2&gt;
  
  
  The reframe: it's four eval problems, not one
&lt;/h2&gt;

&lt;p&gt;The thing I had to internalize is that tool-calling eval is four problems stacked, each with its own root cause:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Tool selection:&lt;/strong&gt; right tool, or correctly &lt;em&gt;no&lt;/em&gt; tool&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Argument extraction:&lt;/strong&gt; schema-valid and semantically correct&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Result utilization:&lt;/strong&gt; did it actually use what the tool returned&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Error recovery:&lt;/strong&gt; did it retry, fall back, or escalate&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Score them separately and "the agent failed" collapses into "the argument extractor regressed on date strings on the flight-booking path." One bisect instead of three days.&lt;/p&gt;
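&lt;p&gt;As a rough sketch of what scoring separately buys you, the helper below walks the four layers in order and reports the first one that falls short. The layer keys, the &lt;code&gt;first_failing_layer&lt;/code&gt; name, and the pass threshold are assumptions made for illustration.&lt;/p&gt;

```python
# Layer names follow the four-layer list above; the keys are hypothetical.
LAYERS = (
    "tool_selection",
    "argument_extraction",
    "result_utilization",
    "error_recovery",
)


def first_failing_layer(scores, threshold=1.0):
    """Return the earliest layer scoring below the threshold, or None."""
    for layer in LAYERS:
        score = scores.get(layer)
        # A layer that was never reached (no score) cannot be blamed.
        if score is not None and threshold > score:
            return layer
    return None


# The flight-booking trace: right tool, bad argument, blind retries.
flight_trace = {
    "tool_selection": 1.0,
    "argument_extraction": 0.0,
    "result_utilization": None,  # the tool never returned a usable payload
    "error_recovery": 0.0,
}
```

&lt;p&gt;For the flight-booking trace this points at argument extraction first, which is the layer to bisect before looking at the retry policy.&lt;/p&gt;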

&lt;h2&gt;
  
  
  What I rebuilt
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Layer 1: Tool selection (with the bucket everyone drops)
&lt;/h3&gt;

&lt;p&gt;Score F1 per tool name rather than one global accuracy, so a 28-tool registry doesn't hide a regression on one rare endpoint behind a strong global mean:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;fi.evals&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;evaluate&lt;/span&gt;

&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;function_name_match&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;function_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;predicted_tool&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;expected&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;function_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ground_truth_tool&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The piece almost every post skips is the &lt;strong&gt;irrelevance bucket&lt;/strong&gt;: test cases where the gold answer is "no tool call" (a greeting, a clarification, an in-model factual question). Without those, you can't catch the regression where a prompt revision makes the model bolder about calling &lt;code&gt;search&lt;/code&gt; on every input. BFCL added the bucket for exactly this reason; build it into your private set the same way.&lt;/p&gt;
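&lt;p&gt;A minimal standard-library sketch of per-tool F1 with the irrelevance bucket treated as its own class. It does not use the &lt;code&gt;fi.evals&lt;/code&gt; API; the &lt;code&gt;NO_TOOL&lt;/code&gt; label and the &lt;code&gt;per_tool_f1&lt;/code&gt; helper are hypothetical names, and &lt;code&gt;None&lt;/code&gt; is assumed to mean "no tool call".&lt;/p&gt;

```python
from collections import Counter

# Hypothetical label for test cases whose gold answer is "no tool call".
NO_TOOL = "__no_tool__"


def per_tool_f1(expected, predicted):
    """Compute F1 for each tool name, counting 'no tool' as its own class."""
    true_pos, false_pos, false_neg = Counter(), Counter(), Counter()
    for gold, pred in zip(expected, predicted):
        gold = gold or NO_TOOL
        pred = pred or NO_TOOL
        if gold == pred:
            true_pos[gold] += 1
        else:
            false_pos[pred] += 1  # the model called this when it should not have
            false_neg[gold] += 1  # the model missed this
    scores = {}
    for tool in set(true_pos) | set(false_pos) | set(false_neg):
        denominator = 2 * true_pos[tool] + false_pos[tool] + false_neg[tool]
        scores[tool] = 2 * true_pos[tool] / denominator if denominator else 0.0
    return scores


# Third case is a greeting: the gold answer is no call, the model called search.
gold_calls = ["search_flights", "search_flights", None, None]
model_calls = ["search_flights", "search_flights", "search", None]
f1_by_tool = per_tool_f1(gold_calls, model_calls)
```

&lt;p&gt;In this toy set &lt;code&gt;search_flights&lt;/code&gt; keeps a perfect score while the no-tool class drops, which is the over-eager-calling regression a global accuracy number would blur.&lt;/p&gt;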

&lt;h3&gt;
  
  
  Layer 2: Argument extraction
&lt;/h3&gt;

&lt;p&gt;Schema validation runs first and is deterministic. Pydantic on the model output is the cheapest possible gate:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pydantic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ValidationError&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;SearchFlightsArgs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;departure_airport&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pattern&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;^[A-Z]{3}$&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;arrival_airport&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pattern&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;^[A-Z]{3}$&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;departure_date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pattern&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;^\d{4}-\d{2}-\d{2}$&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;cabin&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pattern&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;^(economy|premium|business|first)$&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But schema-valid isn't correct. &lt;code&gt;departure_date="2026-01-01"&lt;/code&gt; validates fine and is still wrong if the user said "next Friday." That semantic class needs an LLM judge scoring whether the argument captured the user's intent. &lt;code&gt;customer_id="me"&lt;/code&gt; returning someone else's account is the failure that schema validation will never see.&lt;/p&gt;
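&lt;p&gt;For the narrow case of relative dates, part of the semantic check can be made deterministic before reaching for an LLM judge. The sketch below assumes "next Friday" means the first Friday strictly after the reference date, which is only one possible reading; the helper names are hypothetical.&lt;/p&gt;

```python
from datetime import date, timedelta

FRIDAY = 4  # date.weekday() numbers Monday as 0


def next_friday(today):
    """First Friday strictly after `today` (an assumed reading of the phrase)."""
    days_ahead = (FRIDAY - today.weekday()) % 7
    return today + timedelta(days=days_ahead or 7)


def date_argument_matches(arg_value, today):
    """True if the extracted argument is the ISO date the user meant."""
    try:
        parsed = date.fromisoformat(arg_value)
    except ValueError:
        # Not an ISO date at all, e.g. the raw string "next Friday".
        return False
    return parsed == next_friday(today)


reference_day = date(2026, 6, 3)  # a Wednesday
```

&lt;p&gt;With that reference day, &lt;code&gt;"2026-01-01"&lt;/code&gt; passes the schema but fails this check, while &lt;code&gt;"2026-06-05"&lt;/code&gt; passes both. Free-form intent such as &lt;code&gt;customer_id="me"&lt;/code&gt; still needs the judge.&lt;/p&gt;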

&lt;h3&gt;
  
  
  Layer 3: Result utilization (the layer most posts skip entirely)
&lt;/h3&gt;

&lt;p&gt;The tool returned. Does the agent use the payload? Three patterns kept showing up:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;It paraphrases with a number flipped:&lt;/strong&gt; tool returns &lt;code&gt;amount_cents: 4500&lt;/code&gt;, agent says "your refund of $54.00 is processing."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It substitutes prior model knowledge:&lt;/strong&gt; &lt;code&gt;get_account_balance&lt;/code&gt; returns &lt;code&gt;12_400&lt;/code&gt;, model answers from a remembered "$200 threshold" instead.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It uses the result on turn 1, then drifts off it by turn 3:&lt;/strong&gt; quotes the right itinerary, then invents a contradicting baggage policy.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The rubric is Groundedness, except you point the context slot at the tool's return payload instead of a retrieved corpus:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;fi.evals&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Evaluator&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;fi.evals.templates&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Groundedness&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ContextAdherence&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ChunkAttribution&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;fi.testcases&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;TestCase&lt;/span&gt;

&lt;span class="c1"&gt;# The context slot carries the tool's return payload, not a retrieved corpus.&lt;/span&gt;
&lt;span class="n"&gt;tc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;TestCase&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ex&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
              &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tool_call&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="c1"&gt;# Assumes `evaluator` is an already-configured Evaluator instance, and that&lt;/span&gt;
&lt;span class="c1"&gt;# `ex`, `result`, and `tool_call` come from your own test harness.&lt;/span&gt;
&lt;span class="n"&gt;scores&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;evaluator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;eval_templates&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;Groundedness&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="nc"&gt;ContextAdherence&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="nc"&gt;ChunkAttribution&lt;/span&gt;&lt;span class="p"&gt;()],&lt;/span&gt;
    &lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;tc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
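&lt;p&gt;The flipped-number pattern can also be caught with a cheap deterministic check before any judge runs. This is a sketch under assumptions: the regex only recognizes dollar figures, the &lt;code&gt;amount_cents&lt;/code&gt; key mirrors the example payload above, and the helper names are hypothetical.&lt;/p&gt;

```python
import re

# Matches dollar figures such as $45, $45.00 or $1,250.50.
DOLLAR_PATTERN = re.compile(r"\$(\d[\d,]*(?:\.\d{1,2})?)")


def quoted_cents(response):
    """Return every dollar amount in the response as integer cents."""
    return [
        round(float(match.replace(",", "")) * 100)
        for match in DOLLAR_PATTERN.findall(response)
    ]


def amounts_match_payload(response, tool_result, key="amount_cents"):
    """True only if the response quotes an amount and every amount equals the payload."""
    quoted = quoted_cents(response)
    return bool(quoted) and all(cents == tool_result[key] for cents in quoted)


payload = {"amount_cents": 4500}
flipped = "Your refund of $54.00 is processing."
faithful = "Your refund of $45.00 is processing."
```

&lt;p&gt;Here &lt;code&gt;flipped&lt;/code&gt; fails and &lt;code&gt;faithful&lt;/code&gt; passes. A check like this only covers numeric echoes; substituted knowledge and multi-turn drift still need the groundedness rubric.&lt;/p&gt;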



&lt;h3&gt;
  
  
  Layer 4: Error recovery
&lt;/h3&gt;

&lt;p&gt;When the tool 4xx-es or times out, the agent's next move is the eval surface. Did it read the error and correct, or resend the same broken string? Fall back when the primary was down? Stop at a sane retry cap (3 is a common floor; 6 usually means the loop guard is missing)? This is trajectory-level, not per-call:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;fi.evals.metrics.agents&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;TrajectoryScore&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AgentTrajectoryInput&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;fi.evals.metrics.agents.types&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AgentStep&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;TaskDefinition&lt;/span&gt;

&lt;span class="n"&gt;trajectory&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AgentTrajectoryInput&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;trajectory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;AgentStep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tool_used&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                          &lt;span class="n"&gt;tool_args&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tool_result&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                          &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;agent_steps&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;TaskDefinition&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;goal&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;expected_goal&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;user_request&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;available_tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;registered_tools&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;final_result&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;agent_response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;TrajectoryScore&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;compute_one&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;trajectory&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The math that makes all of this non-optional
&lt;/h2&gt;

&lt;p&gt;End-to-end success on a &lt;em&gt;k&lt;/em&gt;-step agent is roughly the product of per-step success rates.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;95% per step over 8 steps lands near &lt;strong&gt;66%&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;99% per step over 8 steps lands near &lt;strong&gt;92%&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;
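
&lt;p&gt;A quick way to check these numbers, as a minimal sketch that assumes every step succeeds independently at the same rate (real steps are usually correlated, so treat it as a rough estimate). The function name &lt;code&gt;end_to_end_success&lt;/code&gt; is illustrative:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def end_to_end_success(per_step, steps):
    """Probability that every one of `steps` independent steps succeeds."""
    return per_step ** steps


# Same per-step rates and step count as the bullets above.
for rate in (0.95, 0.99):
    print(f"{rate:.0%} per step over 8 steps: {end_to_end_success(rate, 8):.1%}")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;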

&lt;p&gt;Roughly one in three sessions ending structurally wrong while per-step scores look green isn't a hypothetical. It's the default math, and it's the most common reason teams ship agents that pass eval and tank in production.&lt;/p&gt;

&lt;p&gt;The fixes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Score the trajectory as a unit (per-step rubric is the gate, trajectory metric is the truth).&lt;/li&gt;
&lt;li&gt;Treat anything longer than five steps as suspect and decompose it.&lt;/li&gt;
&lt;li&gt;Reserve a &lt;code&gt;pass^k&lt;/code&gt; consistency slice: run 30 hard cases &lt;em&gt;k&lt;/em&gt; times each and track the fraction that succeed on all &lt;em&gt;k&lt;/em&gt; runs. When it moves, the planner regressed, not the tools.&lt;/li&gt;
&lt;/ul&gt;
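
&lt;p&gt;The &lt;code&gt;pass^k&lt;/code&gt; slice can be scored in a few lines. This is a minimal sketch of the definition given in the list (the fraction of cases whose &lt;em&gt;k&lt;/em&gt; rollouts all succeed). The &lt;code&gt;results&lt;/code&gt; layout and the case names are hypothetical:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def pass_hat_k(results):
    """Fraction of cases whose k rollouts all succeeded.

    `results` maps a case id to a list of k booleans, one per rollout.
    """
    if not results:
        raise ValueError("no cases to score")
    all_passed = sum(1 for runs in results.values() if all(runs))
    return all_passed / len(results)


# Hypothetical slice: three cases, k = 4 rollouts each.
results = {
    "refund-partial": [True, True, True, True],
    "refund-expired": [True, False, True, True],
    "address-change": [True, True, True, True],
}
print(pass_hat_k(results))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A single failed rollout fails the whole case, which is what makes the metric sensitive to flaky planning rather than to average quality.&lt;/p&gt;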

&lt;h2&gt;
  
  
  What I still use public benchmarks for
&lt;/h2&gt;

&lt;p&gt;I didn't throw out BFCL or τ-bench; I just stopped pretending they gate production.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;BFCL&lt;/strong&gt; tells you whether the underlying model can call tools at all (AST, executable, irrelevance).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;τ-bench&lt;/strong&gt; tells you about multi-turn reliability. Even GPT-4o lands below 25% at &lt;code&gt;pass^8&lt;/code&gt; on retail.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Both are a model-selection floor. Neither knows anything about your registry, your schemas, your error codes, or your business policy. The private eval set, stratified by tool, argument-edge-case, and error code, with failing production traces promoted in weekly, is the one that gates the ship.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'd do differently
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Score per-layer from day one&lt;/strong&gt;, not aggregate task-completion. Five rubrics per case costs more, but when CI fails, the failing layer name &lt;em&gt;is&lt;/em&gt; the root cause.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Treat groundedness-on-tool-output as noisier than on a retrieved corpus.&lt;/strong&gt; Payloads are JSON, the rubric reasons over fields. Pin a small human-labelled calibration set, re-tune monthly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Run the &lt;code&gt;pass^k&lt;/code&gt; slice on release candidates, not every PR.&lt;/strong&gt; 30 cases × 8 rollouts is 240 agent runs. Worth it at the right cadence, painful as a per-commit gate.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're running tool-calling agents in production on aggregate task-completion alone, you're flying with one eye closed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Curious about your setup
&lt;/h2&gt;

&lt;p&gt;Anyone else been bitten by the green-everywhere-but-broken trace? Specifically:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Do you score arguments semantically, or stop at schema validation?&lt;/li&gt;
&lt;li&gt;Result utilization: are you grounding against the tool payload, or only the retrieved corpus?&lt;/li&gt;
&lt;li&gt;How much do you trust LLM-as-judge for grounding on live production traffic?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Drop a comment, I read all of them. The four-layer stack runs on an open-source eval SDK too, so if you want to get started, say the word and I'll share the link.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>agents</category>
      <category>testing</category>
    </item>
    <item>
      <title>How I Manage All My Claude Code Sessions from a Single Terminal</title>
      <dc:creator>S. Afsan</dc:creator>
      <pubDate>Wed, 03 Jun 2026 06:37:01 +0000</pubDate>
      <link>https://dev.to/writewithafsan/how-i-manage-all-my-claude-code-sessions-from-a-single-terminal-ea5</link>
      <guid>https://dev.to/writewithafsan/how-i-manage-all-my-claude-code-sessions-from-a-single-terminal-ea5</guid>
      <description>&lt;p&gt;I run multiple Claude Code sessions all day — one per feature, one per service, sometimes five at once.&lt;/p&gt;

&lt;p&gt;Every session was asking me for permission in its own terminal. I'd miss requests buried in a background tab. I'd switch windows mid-thought just to approve a &lt;code&gt;git status&lt;/code&gt;. I'd lose context constantly.&lt;/p&gt;

&lt;p&gt;And there was no single place to see what Claude was doing across all of them.&lt;/p&gt;

&lt;p&gt;So I built &lt;strong&gt;Gatekeeper&lt;/strong&gt; — a TUI daemon that intercepts every Claude Code tool call and routes it to one unified approval dashboard.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2y01mwv79za0rfncfwen.gif" alt=" " width="720" height="372"&gt;
&lt;/h2&gt;

&lt;h2&gt;
  
  
  The dashboard
&lt;/h2&gt;

&lt;p&gt;Three panes, one terminal:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Left&lt;/strong&gt; — all active Claude sessions, with status badges: &lt;code&gt;[auto]&lt;/code&gt; means auto-approve is on, &lt;code&gt;[linked]&lt;/code&gt; means it's wired to a terminal window&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Middle&lt;/strong&gt; — pending permission requests with an age timer so you know what's been waiting longest&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Right&lt;/strong&gt; — full request detail, danger warnings, and the numbered approval menu&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every Claude Code tool call — &lt;code&gt;Bash&lt;/code&gt;, &lt;code&gt;Edit&lt;/code&gt;, &lt;code&gt;Write&lt;/code&gt;, &lt;code&gt;Agent&lt;/code&gt; — passes through a &lt;code&gt;PreToolUse&lt;/code&gt; hook before executing. The hook connects to Gatekeeper's Unix socket, sends the request, and blocks. Gatekeeper shows it in the UI. When you decide, the answer travels back and Claude proceeds or stops.&lt;/p&gt;
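
&lt;p&gt;A minimal sketch of that blocking round trip. The one-line JSON message format and the &lt;code&gt;decision&lt;/code&gt; field are assumptions for illustration, not Gatekeeper's actual protocol, and &lt;code&gt;socket.socketpair()&lt;/code&gt; stands in for the real Unix socket so the example runs on its own:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import json
import socket
import threading


def ask_gatekeeper(sock, request):
    """Send one JSON request and block until a one-line JSON decision arrives."""
    sock.sendall(json.dumps(request).encode() + b"\n")
    reply = sock.makefile("r").readline()  # blocks until the daemon answers
    return json.loads(reply)["decision"]


def fake_daemon(sock):
    """Stand-in for the dashboard: read one request, deny anything that runs rm."""
    request = json.loads(sock.makefile("r").readline())
    command = request["tool_input"].get("command", "")
    decision = "deny" if command.startswith("rm ") else "allow"
    sock.sendall(json.dumps({"decision": decision}).encode() + b"\n")


# One end plays the hook, the other plays the daemon.
hook_end, daemon_end = socket.socketpair()
threading.Thread(target=fake_daemon, args=(daemon_end,), daemon=True).start()
decision = ask_gatekeeper(
    hook_end, {"tool_name": "Bash", "tool_input": {"command": "git status"}}
)
print(decision)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because the hook side blocks on the read, the tool call cannot proceed until a decision comes back, which is the property the design relies on.&lt;/p&gt;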




&lt;h2&gt;
  
  
  Approving requests
&lt;/h2&gt;

&lt;p&gt;The menu in the right pane mirrors Claude Code's own style:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1  Allow once
2  Always allow
3  Deny
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;↑&lt;/code&gt;/&lt;code&gt;↓&lt;/code&gt; moves the cursor, &lt;code&gt;Enter&lt;/code&gt; confirms. Or just press &lt;code&gt;1&lt;/code&gt;, &lt;code&gt;2&lt;/code&gt;, &lt;code&gt;3&lt;/code&gt; directly. &lt;code&gt;A&lt;/code&gt; and &lt;code&gt;D&lt;/code&gt; are quick shortcuts for allow/deny.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Option 2 — always allow&lt;/strong&gt; — is where it gets useful. Choosing it saves a persistent rule so the same request never surfaces again:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;Bash&lt;/code&gt; → saves the command pattern (e.g. &lt;code&gt;npm run *&lt;/code&gt;) to config&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Edit&lt;/code&gt; / &lt;code&gt;Write&lt;/code&gt; → saves the directory to an allowlist&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Agent&lt;/code&gt; → enables auto-approve for that session&lt;/li&gt;
&lt;/ul&gt;
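
&lt;p&gt;How a saved &lt;code&gt;Bash&lt;/code&gt; pattern might be checked, as a sketch that assumes shell-style glob matching. The article does not say how Gatekeeper matches patterns, so the helper name and the matching rule here are illustrative:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from fnmatch import fnmatchcase


def is_always_allowed(command, patterns):
    """True if the command matches any saved shell-style pattern."""
    return any(fnmatchcase(command, pattern) for pattern in patterns)


saved_patterns = ["npm run *", "git status"]
print(is_always_allowed("npm run build", saved_patterns))
print(is_always_allowed("npm install left-pad", saved_patterns))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;One design caveat with plain globs: &lt;code&gt;npm run *&lt;/code&gt; would also match a command that chains something else after a semicolon, so a matcher like this needs a separate check for shell operators before trusting a match.&lt;/p&gt;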

&lt;p&gt;The rule is written both to Gatekeeper's own config &lt;em&gt;and&lt;/em&gt; to Claude Code's &lt;code&gt;settings.json&lt;/code&gt; allowlist — so Claude Code itself won't prompt for it either.&lt;/p&gt;




&lt;h2&gt;
  
  
  Auto-approve sessions
&lt;/h2&gt;

&lt;p&gt;Press &lt;code&gt;A&lt;/code&gt; in the Sessions pane to mark a session as trusted. It shows &lt;code&gt;[auto]&lt;/code&gt; — routine tool calls pass silently without appearing in the queue.&lt;/p&gt;

&lt;p&gt;But some things &lt;strong&gt;always&lt;/strong&gt; require manual approval, no matter what:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;What's blocked&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;File deletion&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;rm&lt;/code&gt;, &lt;code&gt;rmdir&lt;/code&gt;, &lt;code&gt;shred&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Remote access&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;ssh&lt;/code&gt;, &lt;code&gt;scp&lt;/code&gt;, &lt;code&gt;rsync&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Privilege escalation&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;sudo&lt;/code&gt;, &lt;code&gt;su&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Destructive git&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;push --force&lt;/code&gt;, &lt;code&gt;reset --hard&lt;/code&gt;, &lt;code&gt;clean -f&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Infrastructure&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;terraform apply/destroy&lt;/code&gt;, &lt;code&gt;kubectl delete&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sensitive paths&lt;/td&gt;
&lt;td&gt;Writes to &lt;code&gt;/etc/&lt;/code&gt;, &lt;code&gt;~/.ssh/&lt;/code&gt;, &lt;code&gt;~/.aws/&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Low-risk commands such as &lt;code&gt;grep&lt;/code&gt;, &lt;code&gt;find&lt;/code&gt;, &lt;code&gt;ls&lt;/code&gt;, &lt;code&gt;cat&lt;/code&gt;, &lt;code&gt;git status&lt;/code&gt;, and &lt;code&gt;npm install&lt;/code&gt; always pass through freely. Most of these are read-only; &lt;code&gt;npm install&lt;/code&gt; is the exception, since it writes to disk.&lt;/p&gt;
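
&lt;p&gt;A sketch of how the always-manual check could work for three of the table's categories. The program names come from the table; the matching logic is an assumption. It looks only at the first token, so it does not cover the subcommand rows such as &lt;code&gt;push --force&lt;/code&gt; or the sensitive-path rule:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import shlex

# Program names copied from the table; the lookup logic is illustrative.
ALWAYS_MANUAL = {
    "File deletion": {"rm", "rmdir", "shred"},
    "Remote access": {"ssh", "scp", "rsync"},
    "Privilege escalation": {"sudo", "su"},
}


def manual_category(command):
    """Return the category that forces manual approval, or None."""
    tokens = shlex.split(command)
    if not tokens:
        return None
    for category, programs in ALWAYS_MANUAL.items():
        if tokens[0] in programs:
            return category
    return None


print(manual_category("rm -rf build/"))
print(manual_category("git status"))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;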




&lt;h2&gt;
  
  
  Linking sessions to terminals
&lt;/h2&gt;

&lt;p&gt;This is the feature that unlocks everything else.&lt;/p&gt;

&lt;p&gt;Press &lt;code&gt;L&lt;/code&gt; on any session in the Sessions pane. An overlay appears — switch to the Claude terminal tab (alt+tab, click, whatever), and Gatekeeper detects the focus change and links that session to that window automatically. The session shows &lt;code&gt;[linked]&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Links persist across restarts in &lt;code&gt;~/.claude/perm-window-map.json&lt;/code&gt;. You link once, it stays.&lt;/p&gt;




&lt;h2&gt;
  
  
  Sending messages from Gatekeeper
&lt;/h2&gt;

&lt;p&gt;Once a session is linked, press &lt;code&gt;M&lt;/code&gt;, type your message, press &lt;code&gt;Enter&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Gatekeeper injects the text into the linked Claude terminal using X11 XTEST — it appears &lt;strong&gt;and submits automatically&lt;/strong&gt;, exactly as if you typed it and pressed Enter there. You never leave the Gatekeeper terminal.&lt;/p&gt;

&lt;p&gt;This solves a problem I didn't know I had until I built it: Claude pauses mid-task and asks a clarifying question — &lt;code&gt;A / B / C?&lt;/code&gt;. Normally you'd switch to that terminal, answer, switch back. With Gatekeeper, you just press &lt;code&gt;M&lt;/code&gt; and type from wherever you are.&lt;/p&gt;

&lt;p&gt;Useful for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Answering Claude's mid-task questions without switching windows&lt;/li&gt;
&lt;li&gt;Explaining why you denied a request&lt;/li&gt;
&lt;li&gt;Redirecting Claude to a different approach while it waits&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One caveat: injection works when each Claude session is in its own terminal &lt;strong&gt;window&lt;/strong&gt;. If multiple sessions share one window as tabs, they share the same X11 window ID — Gatekeeper can't target a specific tab. Run each session in a new window (&lt;code&gt;kitty&lt;/code&gt;, &lt;code&gt;gnome-terminal --window&lt;/code&gt;, etc.).&lt;/p&gt;




&lt;h2&gt;
  
  
  Settings
&lt;/h2&gt;

&lt;p&gt;Press &lt;code&gt;S&lt;/code&gt; to open the settings panel. From here you can configure:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Tool types&lt;/strong&gt; — which tools (Bash, Edit, Write, Agent) Gatekeeper intercepts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bash categories&lt;/strong&gt; — how commands are classified (read-only vs. destructive vs. network, etc.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Custom patterns&lt;/strong&gt; — your own allow/deny rules beyond the defaults&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No config file spelunking. Everything is editable from inside the dashboard.&lt;/p&gt;




&lt;h2&gt;
  
  
  Stats
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gatekeeper stats        &lt;span class="c"&gt;# today&lt;/span&gt;
gatekeeper stats 7      &lt;span class="c"&gt;# last 7 days&lt;/span&gt;
gatekeeper stats all    &lt;span class="c"&gt;# all time&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;====================================================
 GATEKEEPER STATS
====================================================
  Total decisions : 177
  Auto-approved   :  16  (  9%)
  Manual reviewed : 161  ( 90%)
    allowed       : 161
    denied        :   0

  Auto-approved by session:
    b73f7ccc    7 calls
    a8ed1d57    5 calls

  Auto-approved by tool:
    Bash          11
    Edit           5
====================================================
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every decision is logged to &lt;code&gt;~/.claude/perm-logs/YYYY-MM-DD.log&lt;/code&gt;, one file per day, kept indefinitely. Useful for auditing what Claude did across a long session or a whole project.&lt;/p&gt;




&lt;h2&gt;
  
  
  What happens when Gatekeeper isn't running
&lt;/h2&gt;

&lt;p&gt;The hook falls back to a &lt;code&gt;Y/n&lt;/code&gt; prompt in the Claude terminal with a 30-second auto-deny. Nothing hangs, nothing silently passes. You can also set &lt;code&gt;GATEKEEPER_TIMEOUT=0&lt;/code&gt; to always use the terminal prompt for a specific session.&lt;/p&gt;




&lt;h2&gt;
  
  
  How it's wired up
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;install.sh&lt;/code&gt; does four things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Installs wrapper scripts in &lt;code&gt;~/.claude/bin/&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Registers the &lt;code&gt;PreToolUse&lt;/code&gt; hook in &lt;code&gt;~/.claude/settings.json&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Adds blanket &lt;code&gt;permissions.allow&lt;/code&gt; rules so Claude Code doesn't double-prompt&lt;/li&gt;
&lt;li&gt;Sets &lt;code&gt;permissions.defaultMode = "bypassPermissions"&lt;/code&gt; — disables Claude Code's built-in dialogs entirely, making Gatekeeper the &lt;strong&gt;sole&lt;/strong&gt; approval gate&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That last point matters: Claude Code's own hardcoded prompts for sensitive paths (&lt;code&gt;/proc/&lt;/code&gt;, &lt;code&gt;/sys/&lt;/code&gt;, &lt;code&gt;~/.bashrc&lt;/code&gt;) are suppressed in &lt;code&gt;bypassPermissions&lt;/code&gt; mode. Gatekeeper handles everything instead.&lt;/p&gt;




&lt;h2&gt;
  
  
  Installation
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/Btocode/gatekeeper
&lt;span class="nb"&gt;cd &lt;/span&gt;gatekeeper
python3 &lt;span class="nt"&gt;-m&lt;/span&gt; venv .venv &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;source&lt;/span&gt; .venv/bin/activate
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
bash install.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then open a dedicated terminal and run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gatekeeper
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Start your Claude Code sessions anywhere — other terminals, VS Code, JetBrains. Every tool call will appear in Gatekeeper.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Requirements:&lt;/strong&gt; Linux + X11 + Python 3.11+&lt;/p&gt;




&lt;h2&gt;
  
  
  Why I built this
&lt;/h2&gt;

&lt;p&gt;I was working on a project with five Claude sessions running in parallel — one per subsystem. Each one was capable. But I was the bottleneck: constantly switching windows to approve &lt;code&gt;npm run build&lt;/code&gt; for the fifth time that hour.&lt;/p&gt;

&lt;p&gt;Gatekeeper changed that. Trusted sessions handle routine calls without interrupting me. Anything new or risky surfaces in the dashboard. I answer Claude's questions without leaving my main terminal. And at the end of the day, &lt;code&gt;gatekeeper stats&lt;/code&gt; tells me exactly what happened.&lt;/p&gt;

&lt;p&gt;It's open source. MIT licensed.&lt;/p&gt;

&lt;p&gt;👉 &lt;a href="https://github.com/Btocode/gatekeeper" rel="noopener noreferrer"&gt;github.com/Btocode/gatekeeper&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you run Claude Code with multiple sessions, give it a try. And if you build tools like this — follow me, more coming.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Tags:&lt;/strong&gt; &lt;code&gt;claudecode&lt;/code&gt; &lt;code&gt;ai&lt;/code&gt; &lt;code&gt;devtools&lt;/code&gt; &lt;code&gt;opensource&lt;/code&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>claude</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Why Your LLM Agent Gives a Different P-Value Every Time (And What to Build Instead)</title>
      <dc:creator>Cheng Peng</dc:creator>
      <pubDate>Wed, 03 Jun 2026 06:34:28 +0000</pubDate>
      <link>https://dev.to/cheng-peng0718/why-your-llm-agent-gives-a-different-p-value-every-time-and-what-to-build-instead-5dc6</link>
      <guid>https://dev.to/cheng-peng0718/why-your-llm-agent-gives-a-different-p-value-every-time-and-what-to-build-instead-5dc6</guid>
      <description>&lt;p&gt;Hand the same paired before/after dataset (n = 25) to ChatGPT five times. Same prompt: &lt;em&gt;"These are the same subjects measured before and after an intervention. Did their scores change significantly?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Four of the five runs return &lt;code&gt;p = 0.009&lt;/code&gt; from a paired t-test.&lt;/p&gt;

&lt;p&gt;The fifth run does a Shapiro–Wilk normality check on the differences first, decides they're non-normal, switches to a Wilcoxon signed-rank test, and reports &lt;code&gt;p = 0.000018&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzs8mc80s9ty9g2pa61bj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzs8mc80s9ty9g2pa61bj.png" alt=" " width="800" height="518"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvo0f6j7t7rwsnl2gtrlg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvo0f6j7t7rwsnl2gtrlg.png" alt=" " width="800" height="540"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;All five reach the same conclusion (significant). But notice what happened: only one run out of five thought to check an assumption you'd want it to check. The other four skipped it. The choice of &lt;em&gt;method&lt;/em&gt; — and the test statistic, and the p-value — depended on whether the LLM happened to run an assumption check that time. On borderline data, this is the difference between reject and don't reject.&lt;/p&gt;

&lt;p&gt;If you're using LLMs for exploratory data analysis on a weekend project, you might shrug. If you're using them for anything that gets cited, gets submitted to a regulator, or gets handed to a clinician, this is a problem. It's a known problem — &lt;a href="https://arxiv.org/abs/2602.14349" rel="noopener noreferrer"&gt;Cui &amp;amp; Alexander (2026)&lt;/a&gt; documented exactly this kind of method-divergence empirically; &lt;a href="https://arxiv.org/abs/2502.16395" rel="noopener noreferrer"&gt;AIRepr (Zeng et al., 2025)&lt;/a&gt; shows the same thing across reproducibility metrics. The current answer in the literature is to &lt;em&gt;constrain&lt;/em&gt; the agent so its execution is replayable. But replayability fixes "did we run the same code." It doesn't fix "did we run the &lt;em&gt;right&lt;/em&gt; analysis."&lt;/p&gt;

&lt;p&gt;I've spent the last two months building a different fix. The more interesting half is the architecture. Let me walk through it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The real problem isn't temperature
&lt;/h2&gt;

&lt;p&gt;The first reflex is "set &lt;code&gt;temperature=0&lt;/code&gt;." It's not enough.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;temperature=0&lt;/code&gt; doesn't make a tool-using agent deterministic across runs. Three reasons:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Inference isn't bitwise deterministic, even at temperature=0.&lt;/strong&gt; Production LLM serving batches requests dynamically, and the attention kernels aren't batch-invariant — so the same input produces different output tokens depending on what other requests it gets batched with. &lt;a href="https://thinkingmachines.ai/blog/defeating-nondeterminism-in-llm-inference/" rel="noopener noreferrer"&gt;Thinking Machines Lab&lt;/a&gt; and &lt;a href="https://www.lmsys.org/blog/2025-09-22-sglang-deterministic/" rel="noopener noreferrer"&gt;SGLang&lt;/a&gt; are still treating this as an active engineering problem in 2026.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Plausible methods have no principled tiebreaker.&lt;/strong&gt; When a paired t-test and Wilcoxon signed-rank are both reasonable for a moderate-skew paired sample, there's no rule in the model's weights that says which to pick. It picks based on whichever rationale chain it happened to generate (as in the n=25 example above).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Whether an assumption check is even run is stochastic.&lt;/strong&gt; The same dataset, asked the same question, sometimes triggers a Shapiro–Wilk check and sometimes doesn't. When the check is run, it routes to a non-parametric test; when it isn't, the model defaults to a paired t. The case above is exactly this: one in five runs decided to check, four didn't.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The deeper issue: LLM agents try to do two jobs at once. &lt;em&gt;Choose&lt;/em&gt; which analysis to run, and &lt;em&gt;run&lt;/em&gt; the analysis. The first is a judgment problem the LLM is reasonably good at. The second is a computation problem the LLM is bad at, because its sampling is inherently stochastic and its results can't be verified by inspection.&lt;/p&gt;

&lt;h2&gt;
  
  
  "Just write the code yourself"
&lt;/h2&gt;

&lt;p&gt;Natural reaction: stop using the LLM for the computation. Write the scipy code yourself.&lt;/p&gt;

&lt;p&gt;This is right — but it throws out the half that's actually useful. When a researcher says &lt;em&gt;"compare the post-treatment scores between cohorts and tell me if the intervention worked,"&lt;/em&gt; the value of the LLM is mapping that informal request to (a) the right columns in the dataframe, (b) the right method given assumptions, (c) the right multiple-comparison correction, (d) a plain-English summary at the end. That mapping is genuinely hard to encode as a fixed program. Throwing the whole LLM out is overcorrecting.&lt;/p&gt;

&lt;p&gt;What you actually want: keep the LLM for the routing decision, but pin the computation to a fixed, validated implementation that &lt;em&gt;cannot&lt;/em&gt; vary across runs.&lt;/p&gt;

&lt;h2&gt;
  
  
  LLM routes; engine computes
&lt;/h2&gt;

&lt;p&gt;That's the architecture:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;natural-language request
        │
        ▼
   LLM Supervisor ─────────► chooses ONE next action at a time
        │                    (a tool call, or a final answer)
        ▼
 Deterministic plugin ─────► runs a hardcoded statistical method,
        │                    cross-validated against scipy/statsmodels
        ▼
 Claims ledger + gate ─────► verifies that every reported number came
        │                    from an actual plugin run
        ▼
   Auditable report
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;This pattern — let the LLM choose tools, but pin the computation — isn't novel. Variants of it show up in domains as different as &lt;a href="https://dev.to/demianbrecht/stop-asking-llms-to-be-deterministic-e32"&gt;devops automation&lt;/a&gt; and &lt;a href="https://dev.to/nodefiend/trust-the-server-not-the-llm-a-deterministic-approach-to-llm-accuracy-20ag"&gt;financial reporting&lt;/a&gt;. What I think is specific to applying it to statistical inference is the &lt;strong&gt;anti-fabrication discipline&lt;/strong&gt; below: a generic deterministic tool ecosystem still allows the LLM to paraphrase or round the numbers it received. The claims ledger pattern makes that structurally impossible.&lt;/p&gt;

&lt;p&gt;I built this as &lt;a href="https://github.com/Cheng-Peng0718/StatGuard-Agent" rel="noopener noreferrer"&gt;StatGuard Agent&lt;/a&gt;. The supervisor LLM (currently &lt;code&gt;gpt-4o&lt;/code&gt;) picks one of 27 hardcoded analysis plugins per step. The plugins do &lt;em&gt;all&lt;/em&gt; numerical work; the LLM never emits a number. Given the same plugin and the same arguments, the output is byte-identical across runs — the variability that remains is in plugin selection, which is what the validation framework below targets.&lt;/p&gt;

&lt;p&gt;The interesting design choice was not "LLM picks tools" — that's standard agent stuff now. The interesting choice was making sure the LLM never gets to &lt;em&gt;emit a number&lt;/em&gt;.&lt;/p&gt;
&lt;h2&gt;
  
  
  The piece I'd argue should be standard: a claims ledger
&lt;/h2&gt;

&lt;p&gt;Here's the failure mode I really wanted to prevent. Take the opening example: a paired t-test on the n = 25 dataset returns &lt;code&gt;p = 0.009&lt;/code&gt;. Now the LLM produces a final summary for the user. The most likely failure isn't that the wrong test was chosen — we can catch that in routing tests. The most likely failure is that the LLM, in its summary, writes &lt;code&gt;"p = 0.01"&lt;/code&gt;, or &lt;code&gt;"p &amp;lt; 0.01"&lt;/code&gt;, or hallucinates a confidence interval that nobody computed. Over a multi-step analysis, &lt;em&gt;what got computed&lt;/em&gt; and &lt;em&gt;what got reported&lt;/em&gt; can drift apart silently.&lt;/p&gt;

&lt;p&gt;The pattern that fixes this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Every plugin run emits structured &lt;strong&gt;claims&lt;/strong&gt; with stable IDs: &lt;code&gt;claim_42 = {value: 0.009, kind: "p_value", method: "paired_t", n: 25, ...}&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The LLM, during its working session, sees only a &lt;em&gt;list of claim IDs&lt;/em&gt; with their semantic tags ("there is a p-value claim with ID 42"). It does not see the literal numbers in its scratchpad.&lt;/li&gt;
&lt;li&gt;When the LLM emits a final report, it must reference claims by ID: &lt;code&gt;"The intervention shows {claim_42}, suggesting..."&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;A separate, deterministic &lt;strong&gt;render layer&lt;/strong&gt; substitutes claim IDs with the verified text from the original plugin output: &lt;code&gt;"...shows p = 0.009 (paired t-test, n = 25)..."&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
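
&lt;p&gt;The four steps above can be sketched as a small ledger plus renderer. This is a minimal illustration under stated assumptions, not StatGuard's implementation: the class name, the auto-numbered IDs, and the blunt rule that rejects any literal digit outside a placeholder are all choices made for the example.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import re


class ClaimsLedger:
    """Stores plugin outputs by ID and renders reports that reference them."""

    PLACEHOLDER = re.compile(r"\{(claim_\d+)\}")

    def __init__(self):
        self._claims = {}

    def record(self, kind, text):
        """Register verified text from a plugin run and return its claim ID."""
        claim_id = f"claim_{len(self._claims) + 1}"
        self._claims[claim_id] = {"kind": kind, "text": text}
        return claim_id

    def render(self, draft):
        """Swap claim IDs for verified text; reject drafts with literal digits."""
        if re.search(r"\d", self.PLACEHOLDER.sub("", draft)):
            raise ValueError("draft contains a number that is not a claim reference")

        def substitute(match):
            claim_id = match.group(1)
            if claim_id not in self._claims:
                raise KeyError(f"unknown claim: {claim_id}")
            return self._claims[claim_id]["text"]

        return self.PLACEHOLDER.sub(substitute, draft)


ledger = ClaimsLedger()
cid = ledger.record("p_value", "p = 0.009 (paired t-test, n = 25)")
print(ledger.render(f"The intervention shows {{{cid}}}, suggesting a real change."))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The renderer is the only code path that writes digits into the report, so a draft that says &lt;code&gt;p = 0.01&lt;/code&gt; directly fails instead of shipping.&lt;/p&gt;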

&lt;p&gt;The result: the LLM cannot insert a number that wasn't computed. It cannot round. It cannot round-trip. It cannot paraphrase a statistic into something subtly different. It can only point at claims. A coverage gate also enforces that every required piece of evidence (for a group comparison: test statistic, p-value, effect size, assumption check) has been produced before a final answer is allowed.&lt;/p&gt;
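
&lt;p&gt;The coverage gate reduces to a set difference. A sketch, assuming each claim carries a &lt;code&gt;kind&lt;/code&gt; tag; the kind names and the &lt;code&gt;REQUIRED_EVIDENCE&lt;/code&gt; table are illustrative and follow the group-comparison example in the paragraph above:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Evidence each analysis type must produce before a final answer is allowed.
REQUIRED_EVIDENCE = {
    "group_comparison": {"test_statistic", "p_value", "effect_size", "assumption_check"},
}


def missing_evidence(analysis_type, produced_kinds):
    """Return the evidence kinds still missing for this analysis type."""
    return REQUIRED_EVIDENCE[analysis_type] - set(produced_kinds)


produced = ["test_statistic", "p_value"]
gaps = missing_evidence("group_comparison", produced)
print(sorted(gaps))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A non-empty result blocks the final answer and tells the supervisor which plugin runs are still owed.&lt;/p&gt;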

&lt;p&gt;I'd argue this pattern should be standard for any agent that produces structured numerical output, not just statistics ones. The principle: &lt;strong&gt;LLMs are pointers, not values.&lt;/strong&gt; Numbers, dates, quotes from documents, monetary amounts — anything where "almost right" is wrong — should be produced by a deterministic tool, given a claim ID, and stitched into the final text by a renderer that the LLM cannot touch.&lt;/p&gt;
&lt;h2&gt;
  
  
  How do we actually know it works?
&lt;/h2&gt;

&lt;p&gt;Two layers of validation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 1 — plugin carpet benchmark.&lt;/strong&gt; For every plugin, generate scenarios with fixed seeds and known ground truth, then check the plugin's output against an independent &lt;code&gt;scipy&lt;/code&gt;/&lt;code&gt;statsmodels&lt;/code&gt; computation of the same quantity. The current carpet is 362 cases, all passing. This validates the plugins as plugins, with the LLM out of the picture.&lt;/p&gt;
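
&lt;p&gt;A sketch of what one carpet check could look like. To keep it dependency-free, the stdlib &lt;code&gt;statistics&lt;/code&gt; module stands in for the &lt;code&gt;scipy&lt;/code&gt;/&lt;code&gt;statsmodels&lt;/code&gt; reference, and the scenario generator, function names, and tolerance are assumptions for the example:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import math
import random
import statistics


def paired_t_statistic(before, after):
    """'Plugin' under test: paired t statistic computed from first principles."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean = sum(diffs) / n
    variance = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / math.sqrt(variance / n)


def reference_t_statistic(before, after):
    """Independent reference computation of the same quantity."""
    diffs = [a - b for a, b in zip(after, before)]
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(len(diffs)))


def run_carpet(seeds, n=25):
    """Return the seeds where plugin and reference disagree beyond tolerance."""
    failures = []
    for seed in seeds:
        rng = random.Random(seed)  # fixed seed makes each scenario reproducible
        before = [rng.gauss(50, 10) for _ in range(n)]
        after = [b + rng.gauss(2, 5) for b in before]
        if not math.isclose(paired_t_statistic(before, after),
                            reference_t_statistic(before, after), rel_tol=1e-9):
            failures.append(seed)
    return failures


print(run_carpet(range(20)))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;An empty list means every seeded scenario agreed with the reference; any seed that shows up is a reproducible failing case.&lt;/p&gt;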

&lt;p&gt;&lt;strong&gt;Layer 2 — end-to-end agent benchmark.&lt;/strong&gt; Drive the full LLM-supervised pipeline on a representative 42-case subset of the same matrix. Each case is judged on four dimensions: (a) the LLM picked the right plugin (routing), (b) the agent reached a final answer (no-error), (c) the claims ledger is clean — every reported number traceable to a plugin run (honesty), (d) the final numerical output is within tolerance of the ground truth (accuracy). Current pass rate: 42/42 on all four.&lt;/p&gt;

&lt;p&gt;Plus 764 deterministic unit/integration tests for everything else.&lt;/p&gt;

&lt;p&gt;The most useful lesson came during e2e validation. The first run passed routing on 36 of 38 cases. Two cases failed because, on prompts framed for FDA submission or audit-grade contexts, the LLM didn't reach for the more rigorous bootstrap mode it should have. That kind of failure isn't a computation bug, it's a &lt;em&gt;judgment&lt;/em&gt; bug, and it only surfaces in an e2e benchmark, not a plugin-layer one. I tightened the plugin's &lt;code&gt;use_when&lt;/code&gt; specification with explicit triggers ("FDA", "audit-grade", "clinical", "third-party re-run"), re-ran, and got 38/38. The pattern: e2e benchmarks find specification gaps; plugin benchmarks find code gaps.&lt;/p&gt;
&lt;h2&gt;
  
  
  One feature worth mentioning by name
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;bootstrap_inference&lt;/code&gt; plugin produces confidence intervals for paired-difference statistics under percentile, basic, and BCa methods, all cross-validated against &lt;code&gt;scipy.stats.bootstrap&lt;/code&gt;. It also has an opt-in &lt;strong&gt;Sequential Bootstrap&lt;/strong&gt; mode (&lt;a href="https://arxiv.org/abs/2511.18065" rel="noopener noreferrer"&gt;Peng 2025&lt;/a&gt;) for cases where the bootstrap CI itself needs to be more stable across RNG seeds — regulated submissions, audit reports. Every call emits a cross-seed CI endpoint-stability diagnostic so you can compare the two modes on your data.&lt;/p&gt;
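
&lt;p&gt;For readers who have not used these methods, here is a minimal Python sketch of a percentile bootstrap CI for the mean of paired differences, plus a crude cross-seed stability check. It uses only the standard library and covers only the plain percentile method, not BCa and not the Sequential Bootstrap. The function names and the sample data are made up for illustration and are not the plugin's API.&lt;/p&gt;

```python
import random
import statistics


def percentile_ci(diffs, seed, n_resamples=2000, level=0.95):
    """Percentile bootstrap CI for the mean of paired differences."""
    rng = random.Random(seed)
    n = len(diffs)
    means = sorted(
        statistics.fmean(rng.choices(diffs, k=n)) for _ in range(n_resamples)
    )
    alpha = (1.0 - level) / 2.0
    lo = means[int(alpha * n_resamples)]
    hi = means[int((1.0 - alpha) * n_resamples) - 1]
    return lo, hi


def endpoint_spread(diffs, seeds):
    """Range of each CI endpoint across RNG seeds (smaller means more stable)."""
    cis = [percentile_ci(diffs, seed) for seed in seeds]
    lows = [ci[0] for ci in cis]
    highs = [ci[1] for ci in cis]
    return max(lows) - min(lows), max(highs) - min(highs)


# Made-up paired measurements, for illustration only.
before = [12.1, 11.4, 13.0, 12.7, 11.9, 12.3, 13.4, 12.0]
after = [12.9, 11.8, 13.1, 13.5, 12.2, 12.9, 13.9, 12.6]
diffs = [a - b for a, b in zip(after, before)]

lo, hi = percentile_ci(diffs, seed=0)
low_spread, high_spread = endpoint_spread(diffs, seeds=range(5))
print(f"95% CI: [{lo:.3f}, {hi:.3f}]")
print(f"endpoint spread across seeds: {low_spread:.3f}, {high_spread:.3f}")
```

&lt;p&gt;The spread printed on the last line is the kind of quantity a cross-seed diagnostic reports: if the endpoints move noticeably when only the seed changes, the interval is not yet stable enough for an audit trail.&lt;/p&gt;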
&lt;h2&gt;
  
  
  What this isn't
&lt;/h2&gt;

&lt;p&gt;Up front:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pre-adoption.&lt;/strong&gt; v0.2.0 just dropped. Real-world users are zero or one (you, possibly).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scope is narrow and intentional.&lt;/strong&gt; Standard univariate statistical inference and OLS. No mixed models, no factorial ANOVA yet, no survival analysis, no deep learning. The design philosophy is "reproducible analysis uses validated methods" — so the framework only covers methods I can validate against a reference implementation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Routing is not perfect.&lt;/strong&gt; The LLM still makes routing mistakes; the 42-case e2e benchmark is how we catch them and tighten the plugin specs. New plugins will need new e2e cases.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;License: MIT.&lt;/strong&gt; Just install and use.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  What's next
&lt;/h2&gt;

&lt;p&gt;Concrete things on the roadmap:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;More plugins.&lt;/strong&gt; Mixed-effects models (LMM / GLMM) for repeated-measures designs. Two-way / factorial ANOVA with interaction effects. Survival analysis (Cox PH, log-rank). Each new plugin gets its own carpet cases and e2e routing cases before merge.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Better routing on ambiguous prompts.&lt;/strong&gt; When a user says &lt;em&gt;"compare these groups"&lt;/em&gt; without specifying paired / independent / repeated, the LLM has to infer. The current routing logic is one-shot; I want to add a clarification loop where the agent asks one targeted question rather than guessing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Jupyter cell magic.&lt;/strong&gt; Most data scientists live in notebooks. A &lt;code&gt;%%statguard compare cohort_A vs cohort_B&lt;/code&gt; cell magic returning a reproducible report in the next cell is more useful than the current Streamlit-only entry point.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scale routing to more plugins without bloating the tool-selection context.&lt;/strong&gt; With 27 plugins the tool-description payload is manageable. At 100 plugins it won't be — LLM context fills with metadata that's irrelevant to the current request. Likely path: a two-stage router that first picks a plugin &lt;em&gt;family&lt;/em&gt; (comparison / regression / description / SQL), then picks the specific plugin within that family, halving the per-turn metadata payload.&lt;/li&gt;
&lt;/ul&gt;
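
&lt;p&gt;The two-stage router from the last item can be sketched as follows. This is a toy under stated assumptions: the registry contents and plugin names are hypothetical, and a simple word-overlap score stands in for the LLM call at each stage.&lt;/p&gt;

```python
# Hypothetical registry: family name, then plugin name, then a one-line description.
REGISTRY = {
    "comparison": {
        "paired_t_test": "compare paired measurements on the same subjects",
        "welch_t_test": "compare means of two independent groups",
    },
    "regression": {
        "ols": "fit a linear regression model with ordinary least squares",
    },
    "description": {
        "summary_stats": "describe one column with mean, median, and spread",
    },
}

FAMILY_BLURBS = {
    "comparison": "compare groups or paired conditions",
    "regression": "fit or model a relationship between variables",
    "description": "describe or summarize a single variable",
}


def pick(request, options):
    """Stand-in for an LLM choice: pick the option sharing the most words."""
    words = set(request.lower().split())
    return max(
        options,
        key=lambda name: len(words.intersection(options[name].split())),
    )


def route(request):
    family = pick(request, FAMILY_BLURBS)     # stage 1 sees only family blurbs
    plugin = pick(request, REGISTRY[family])  # stage 2 sees one family's plugins
    return family, plugin


family, plugin = route("compare paired blood pressure measurements")
print(family, plugin)
```

&lt;p&gt;Each stage only ever receives a slice of the metadata: a handful of family blurbs first, then the descriptions for a single family, which is what keeps the per-turn payload bounded as the plugin count grows.&lt;/p&gt;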

&lt;p&gt;If you build agents that produce structured numerical output and want to talk about the claims-ledger pattern, I'd love to hear from you. If you're a statistician with an opinion on what's missing from the plugin set, file an issue. If you're hiring for ML / data engineering / AI applications roles in the US, I'm currently looking — reach out if you're sourcing.&lt;/p&gt;

&lt;p&gt;The repo:&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/Cheng-Peng0718" rel="noopener noreferrer"&gt;
        Cheng-Peng0718
      &lt;/a&gt; / &lt;a href="https://github.com/Cheng-Peng0718/StatGuard-Agent" rel="noopener noreferrer"&gt;
        StatGuard-Agent
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      An auditable statistical analysis framework pairing LLM orchestration with a deterministic, scipy-cross-validated statistics engine. The LLM routes; the engine computes and self-verifies.
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;p&gt;&lt;a href="https://doi.org/10.5281/zenodo.20519404" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/1c0f1e39774a95105d2054e9e2e083e76c7647a88c27e0b25db602cee9617735/68747470733a2f2f7a656e6f646f2e6f72672f62616467652f444f492f31302e353238312f7a656e6f646f2e32303531393430342e737667" alt="DOI"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;StatGuard Agent&lt;/h1&gt;
&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;An auditable statistical analysis framework that pairs LLM orchestration with a deterministic, cross-validated statistics engine.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;StatGuard Agent turns a natural-language analysis request into an end-to-end, reproducible statistical report. It is built on a deliberate separation of concerns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The LLM orchestrates&lt;/strong&gt; — it reads the request, inspects the data, and decides &lt;em&gt;which&lt;/em&gt; analysis to run next.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The deterministic engine computes&lt;/strong&gt; — every statistic is produced by hardcoded, plugin-based methods that are cross-validated against &lt;code&gt;scipy&lt;/code&gt; / &lt;code&gt;statsmodels&lt;/code&gt;, never by the LLM itself.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This division is the core design principle. A general-purpose LLM asked to "compare these groups" may silently pick the wrong test, skip an assumption check, or report a number it did not actually compute — and may do so &lt;em&gt;differently every time it is run&lt;/em&gt;. A traditional tool like SPSS is reproducible but cannot interpret an open-ended request. StatGuard Agent aims for both: &lt;strong&gt;as adaptable as&lt;/strong&gt;…&lt;/p&gt;
&lt;/div&gt;


&lt;/div&gt;
&lt;br&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/Cheng-Peng0718/StatGuard-Agent" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;br&gt;
&lt;/div&gt;
&lt;br&gt;


&lt;p&gt;Stars, issues, and adversarial test cases all welcome.&lt;/p&gt;

</description>
      <category>python</category>
      <category>llm</category>
      <category>datascience</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Smart Lighting Protocol Showdown: Zigbee vs Matter vs BLE Mesh (2026)</title>
      <dc:creator>lamp nex</dc:creator>
      <pubDate>Wed, 03 Jun 2026 06:34:21 +0000</pubDate>
      <link>https://dev.to/lamp_nex_8cbfdfb5b5aa6b50/smart-lighting-protocol-showdown-zigbee-vs-matter-vs-ble-mesh-2026-4h6j</link>
      <guid>https://dev.to/lamp_nex_8cbfdfb5b5aa6b50/smart-lighting-protocol-showdown-zigbee-vs-matter-vs-ble-mesh-2026-4h6j</guid>
      <description>&lt;h1&gt;
  
  
  Smart Lighting Protocol Showdown: Zigbee vs Matter vs BLE Mesh (2026)
&lt;/h1&gt;

&lt;p&gt;After deploying thousands of Zigbee smart lights through our manufacturing line at nexLAMP, and watching countless customers struggle with protocol selection, I decided to write this practical comparison.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Real Problem
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;"My smart lights keep disconnecting! I think I chose the wrong protocol..."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is the #1 complaint I see on Reddit, Xiaohongshu, and Zhihu. The fix isn't a better router — it's choosing the right protocol from day one.&lt;/p&gt;

&lt;h2&gt;
  
  
  Protocol Deep Dive
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Zigbee — The Workhorse
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;Frequency&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;2.4 GHz (IEEE 802.15.4; same band as WiFi, separate protocol)&lt;/span&gt;
&lt;span class="na"&gt;Topology&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Star + Mesh hybrid&lt;/span&gt;
&lt;span class="na"&gt;Max devices&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;200+ per coordinator&lt;/span&gt;
&lt;span class="na"&gt;Latency&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;50-200ms&lt;/span&gt;
&lt;span class="na"&gt;Cost/unit&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;~$3.5-5.0 (Tuya Zigbee drivers)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why it wins for lighting:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each mains-powered node (every light) is a repeater → self-healing mesh&lt;/li&gt;
&lt;li&gt;Ultra-low power → years on coin cell for sensors&lt;/li&gt;
&lt;li&gt;Mature ecosystem → Tuya, Hue, Aqara, Xiaomi all ship Zigbee&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The catch:&lt;/strong&gt; You need a Zigbee gateway (~$15-20). This is the only upfront cost.&lt;/p&gt;

&lt;h3&gt;
  
  
  BLE Mesh — The Budget Option
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;Frequency&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;2.4 GHz (shared with WiFi/BLE)&lt;/span&gt;
&lt;span class="na"&gt;Topology&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Managed flood mesh&lt;/span&gt;
&lt;span class="na"&gt;Max devices&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;~50 (practical limit ~30)&lt;/span&gt;
&lt;span class="na"&gt;Latency&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;100-500ms (increases with node count)&lt;/span&gt;
&lt;span class="na"&gt;Cost/unit&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;~$2.0-3.5&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



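&lt;p&gt;&lt;strong&gt;The flooding problem:&lt;/strong&gt; Every command is flooded through the whole network, with relay nodes rebroadcasting what they hear. In a dense network of N nodes, a single command can cause on the order of N² receptions, so traffic grows quickly with node count. Past 30 devices, you'll notice visible lag.&lt;/p&gt;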
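
&lt;p&gt;To make the flooding cost concrete, here is a minimal Python sketch that counts radio events for one command. It is not taken from any BLE Mesh stack. It assumes every node relays each message exactly once, ignores TTL limits and retransmissions, and models the worst case where all nodes are within radio range of each other. &lt;code&gt;flood_cost&lt;/code&gt; and &lt;code&gt;dense_room&lt;/code&gt; are illustrative names.&lt;/p&gt;

```python
def flood_cost(adjacency, source):
    """Count radio transmissions and receptions for one flooded command.

    Assumes every node relays a message exactly once (duplicates are
    dropped via a message cache) and that no TTL limit cuts the flood short.
    """
    seen = {source}
    queue = [source]
    transmissions = 0
    receptions = 0
    while queue:
        node = queue.pop(0)
        transmissions += 1                  # the node broadcasts once
        for neighbor in adjacency[node]:
            receptions += 1                 # every neighbor hears it
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)      # first copy: schedule a relay
    return transmissions, receptions


def dense_room(n):
    """Worst case for flooding: every node is in radio range of every other."""
    return {i: [j for j in range(n) if j != i] for i in range(n)}


for n in (6, 30, 50):
    tx, rx = flood_cost(dense_room(n), source=0)
    print(f"{n} nodes: {tx} transmissions, {rx} receptions")
```

&lt;p&gt;Under these assumptions, transmissions grow linearly with N while receptions grow as N(N-1), which is one way to read the quadratic figure above.&lt;/p&gt;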

&lt;p&gt;&lt;strong&gt;Good for:&lt;/strong&gt; Small apartments (≤ 6 lights), budget projects.&lt;/p&gt;

&lt;h3&gt;
  
  
  Matter — The Future
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;Transport&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Thread (preferred) or WiFi&lt;/span&gt;
&lt;span class="na"&gt;Topology&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Thread mesh (similar to Zigbee)&lt;/span&gt;
&lt;span class="na"&gt;Max devices&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;250+ (theoretical)&lt;/span&gt;
&lt;span class="na"&gt;Latency&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;30-150ms (Thread), variable (WiFi)&lt;/span&gt;
&lt;span class="na"&gt;Cost/unit&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;~$7.0-11.0 (currently higher)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Matter's promise is genuine cross-platform control. But in 2026:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Native HomeKit, Alexa, Google Home support&lt;/li&gt;
&lt;li&gt;Thread mesh is excellent (when it works)&lt;/li&gt;
&lt;li&gt;IP-based → easier cloud integration&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Thread Border Routers aren't ubiquitous yet&lt;/li&gt;
&lt;li&gt;Advanced lighting features still evolving&lt;/li&gt;
&lt;li&gt;Premium pricing for early adoption&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Cost Analysis (20-Fixture Deployment)
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Protocol&lt;/th&gt;
&lt;th&gt;Drivers&lt;/th&gt;
&lt;th&gt;Gateway&lt;/th&gt;
&lt;th&gt;Total&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Zigbee&lt;/td&gt;
&lt;td&gt;$70-100&lt;/td&gt;
&lt;td&gt;$15-20&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$85-120&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;BLE Mesh&lt;/td&gt;
&lt;td&gt;$40-70&lt;/td&gt;
&lt;td&gt;$0-15&lt;/td&gt;
&lt;td&gt;$40-85&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Matter (Thread)&lt;/td&gt;
&lt;td&gt;$140-220&lt;/td&gt;
&lt;td&gt;$30-55&lt;/td&gt;
&lt;td&gt;$170-275&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Zigbee costs ~$40 more than BLE Mesh for 20 lights (comparing the midpoints of the two ranges). That's about $2 per light to avoid most disconnection headaches.&lt;/p&gt;
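
&lt;p&gt;The table and the $40 figure can be reproduced with a short Python calculation. The per-unit and gateway ranges come from the spec blocks and table above; comparing range midpoints is an assumption made here to get a single number.&lt;/p&gt;

```python
FIXTURES = 20

# (driver cost range per unit, gateway cost range), in USD.
PROTOCOLS = {
    "Zigbee": ((3.5, 5.0), (15, 20)),
    "BLE Mesh": ((2.0, 3.5), (0, 15)),
    "Matter (Thread)": ((7.0, 11.0), (30, 55)),
}


def total_range(unit_range, gateway_range, fixtures=FIXTURES):
    low = unit_range[0] * fixtures + gateway_range[0]
    high = unit_range[1] * fixtures + gateway_range[1]
    return low, high


totals = {name: total_range(*ranges) for name, ranges in PROTOCOLS.items()}
for name, (low, high) in totals.items():
    print(f"{name}: ${low:.0f}-{high:.0f}")

# Compare range midpoints to get a single "typical" premium.
midpoint = {name: (low + high) / 2 for name, (low, high) in totals.items()}
premium = midpoint["Zigbee"] - midpoint["BLE Mesh"]
print(f"Zigbee premium: ${premium:.0f} total, ${premium / FIXTURES:.2f} per light")
```

&lt;p&gt;Changing &lt;code&gt;FIXTURES&lt;/code&gt; shows how the one-off gateway cost matters less as the installation grows.&lt;/p&gt;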

&lt;h2&gt;
  
  
  Decision Flowchart
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;New construction / whole-home? → Zigbee
Apple ecosystem only? → Matter
Budget &amp;lt; $60 total? → BLE Mesh
Commercial (50+ fixtures)? → Zigbee
OEM product development? → Zigbee (Tuya)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Production Lessons Learned
&lt;/h2&gt;

&lt;p&gt;At nexLAMP, we standardized on Tuya Zigbee for three reasons:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;OTA firmware updates&lt;/strong&gt; — Critical for long-term maintenance&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Binding/grouping&lt;/strong&gt; — Lights can work without gateway after binding&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ecosystem bridge&lt;/strong&gt; — Tuya gateway bridges Zigbee to Alexa, Google, HomeKit, Mijia&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The Bottom Line
&lt;/h2&gt;

&lt;p&gt;90% of smart lighting users are best served by Zigbee. It's the protocol that "just works" at scale — and when you're dealing with lights in your ceiling, "just works" is the only acceptable answer.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Written by the nexLAMP engineering team. We manufacture Tuya Zigbee smart lighting fixtures for global markets. Questions? Drop a comment below.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>smartlighting</category>
      <category>zigbee</category>
      <category>matter</category>
      <category>iot</category>
    </item>
    <item>
      <title>#javascript #apnacollege #webdev #beginners</title>
      <dc:creator>Ali Hamza</dc:creator>
      <pubDate>Wed, 03 Jun 2026 06:33:36 +0000</pubDate>
      <link>https://dev.to/ali_hamza_589ec7b3eb6688d/javascript-apnacollege-webdev-beginners-366f</link>
      <guid>https://dev.to/ali_hamza_589ec7b3eb6688d/javascript-apnacollege-webdev-beginners-366f</guid>
      <description>&lt;p&gt;Hello Dev Community! 👋&lt;/p&gt;

&lt;p&gt;It is officially &lt;strong&gt;Day 12&lt;/strong&gt; of my journey to master the MERN stack! Today, I wrapped up &lt;strong&gt;Lecture 3 of Apna College's JavaScript playlist&lt;/strong&gt; with Shradha Didi, focusing on a fundamental data type we use every day: &lt;strong&gt;Strings&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Before today, I thought strings were just plain text wrapped in quotes. Today, I learned how much power JavaScript gives us to manipulate, slice, and dynamically format text.&lt;/p&gt;




&lt;h2&gt;
  
  
  🧠 Key Learnings From JS Lecture 3 (Strings)
&lt;/h2&gt;

&lt;p&gt;I explored how JavaScript handles text strings and the built-in properties and methods that make text manipulation effortless:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Template Literals (The Ultimate Game Changer)
&lt;/h3&gt;

&lt;p&gt;Shradha Didi introduced &lt;strong&gt;Template Literals&lt;/strong&gt;, which use backticks (&lt;code&gt;`&lt;/code&gt;) instead of standard quotes. This allows us to perform &lt;strong&gt;String Interpolation&lt;/strong&gt;—embedding variables directly inside a string using &lt;code&gt;${variable}&lt;/code&gt;. It makes code look clean and professional:&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;let obj = { item: "pen", price: 10 };
// Old way: console.log("The cost of", obj.item, "is", obj.price, "rupees.");
// Modern way:
console.log(`The cost of ${obj.item} is ${obj.price} rupees.`);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

</description>
      <category>beginners</category>
      <category>devjournal</category>
      <category>javascript</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Nonprofit Seeks Cost-Effective Website Alternatives to $15,000 Wix Solution for Complex Features</title>
      <dc:creator>Maxim Gerasimov</dc:creator>
      <pubDate>Wed, 03 Jun 2026 06:32:34 +0000</pubDate>
      <link>https://dev.to/maxgeris/nonprofit-seeks-cost-effective-website-alternatives-to-15000-wix-solution-for-complex-features-443c</link>
      <guid>https://dev.to/maxgeris/nonprofit-seeks-cost-effective-website-alternatives-to-15000-wix-solution-for-complex-features-443c</guid>
      <description>&lt;h2&gt;
  
  
  The $15K Wix Dilemma: Why Nonprofits Should Think Twice
&lt;/h2&gt;

&lt;p&gt;A nonprofit employee recently raised a red flag: their organization is considering a $15,000 Wix website to handle complex features like event management, volunteer tracking, an online shop, donor management, and blogs. The employee, skeptical of the price tag and Wix’s suitability, is now tasked with convincing management—who lack technical expertise—to reconsider. This scenario highlights a critical issue: &lt;strong&gt;nonprofits risk overspending on platforms ill-equipped for their needs, leading to long-term inefficiencies and wasted resources.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Here’s the core problem: Wix is a drag-and-drop website builder designed for simplicity, not complexity. While it’s user-friendly for basic sites, it &lt;strong&gt;struggles to scale for advanced functionalities&lt;/strong&gt; like integrated donor management or robust event systems. The $15,000 quote likely reflects inflated costs for customizations that push Wix beyond its intended capabilities. This mismatch between platform limitations and organizational needs creates a &lt;em&gt;risk cascade&lt;/em&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Technical Debt:&lt;/strong&gt; Over-customizing Wix introduces &lt;em&gt;brittle code&lt;/em&gt;: quick fixes that break under updates or increased traffic. For example, adding a donor management system might require third-party integrations that &lt;em&gt;work against Wix’s backend structure&lt;/em&gt;, leading to slow load times or data sync failures.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalability Failure:&lt;/strong&gt; Wix’s infrastructure is optimized for small-scale use. As the nonprofit grows, the site may &lt;em&gt;slow down under load&lt;/em&gt;, with a risk of outages during high-traffic events like fundraising campaigns.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vendor Lock-in:&lt;/strong&gt; Heavy customizations tie the nonprofit to Wix, limiting future migration. If the platform fails to meet needs, the organization faces a &lt;em&gt;break point&lt;/em&gt;: rebuild from scratch or accept subpar performance.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Management’s desperation to update the website after decades of neglect, combined with their lack of technical knowledge, makes them vulnerable to overpriced solutions. The vendor likely exploited this gap, &lt;em&gt;expanding the scope&lt;/em&gt; of the project to justify the cost. For instance, a simple blog could be bundled with unnecessary features, while critical systems like donor management are &lt;em&gt;patched together&lt;/em&gt; instead of built on a robust framework.&lt;/p&gt;

&lt;p&gt;To address this, the nonprofit should:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Audit Actual Needs:&lt;/strong&gt; Identify core vs. optional features. For example, is a full e-commerce shop necessary, or can donations and merchandise sales be handled through simpler tools?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Explore Open-Source Alternatives:&lt;/strong&gt; Platforms like WordPress with plugins like GiveWP (for donations) or Event Espresso (for events) offer &lt;em&gt;modular scalability&lt;/em&gt; at a fraction of the cost. These systems are designed to &lt;em&gt;expand without breaking&lt;/em&gt; under added functionalities.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Seek Expert Consultation:&lt;/strong&gt; A neutral developer can assess the $15,000 quote and propose cost-effective solutions. For instance, a custom-built site on a Laravel or Django framework might cost $20,000 upfront but &lt;em&gt;outperform Wix in longevity and efficiency&lt;/em&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The rule here is clear: &lt;strong&gt;If a nonprofit requires complex, scalable features, avoid Wix.&lt;/strong&gt; Its drag-and-drop simplicity &lt;em&gt;masks limits&lt;/em&gt; that surface under pressure. Instead, invest in a solution tailored to long-term growth, even if it requires a higher initial cost. The alternative is a $15,000 website that &lt;em&gt;buckles under its own weight&lt;/em&gt;, leaving the organization worse off than before.&lt;/p&gt;

&lt;h2&gt;
  
  
  Breaking Down the Costs: Wix vs. Alternatives
&lt;/h2&gt;

&lt;p&gt;The $15,000 quote for a Wix-based website is a red flag, not just because of the price tag, but because of the &lt;strong&gt;fundamental mismatch between Wix’s capabilities and the nonprofit’s complex needs&lt;/strong&gt;. Let’s dissect the costs, risks, and alternatives to show why this is a losing proposition—and what to do instead.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Wix Fails at $15K: The Technical Breakdown
&lt;/h3&gt;

&lt;p&gt;Wix is a &lt;em&gt;drag-and-drop builder&lt;/em&gt;, designed for simplicity, not complexity. When you try to force it to handle advanced features like event management, donor tracking, and e-commerce, the platform &lt;strong&gt;strains under the weight of customizations&lt;/strong&gt;. Here’s how:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Backend Overload:&lt;/strong&gt; Wix’s backend is not built for heavy data processing. Adding custom event management or donor tracking requires &lt;em&gt;patching its limited database structure&lt;/em&gt;, leading to &lt;strong&gt;slow load times&lt;/strong&gt; and &lt;em&gt;data sync failures&lt;/em&gt; as the system struggles to process requests.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Brittle Code:&lt;/strong&gt; Customizations often rely on &lt;em&gt;Wix’s proprietary code&lt;/em&gt;, which &lt;strong&gt;breaks during platform updates&lt;/strong&gt;. This creates &lt;em&gt;technical debt&lt;/em&gt;, forcing constant fixes and limiting future scalability.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalability Collapse:&lt;/strong&gt; Wix’s infrastructure is &lt;em&gt;optimized for small-scale sites&lt;/em&gt;. During high-traffic events (e.g., fundraising campaigns), the site can &lt;strong&gt;become overloaded&lt;/strong&gt;, causing &lt;em&gt;crashes or downtime&lt;/em&gt; exactly when the nonprofit needs reliability most.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At $15K, you’re paying a premium for a &lt;strong&gt;brittle, over-customized Wix site&lt;/strong&gt; that will fail under pressure. The vendor is exploiting management’s lack of technical knowledge to bundle unnecessary features while ignoring critical infrastructure needs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cost-Effective Alternatives: A Comparative Analysis
&lt;/h3&gt;

&lt;p&gt;Here’s how Wix stacks up against viable alternatives, with a focus on &lt;strong&gt;cost, scalability, and long-term efficiency&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;WordPress with Plugins:&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Cost:&lt;/em&gt; $3,000–$8,000 (depending on customization)&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Mechanism:&lt;/em&gt; WordPress is &lt;strong&gt;modular&lt;/strong&gt;, allowing plugins like GiveWP (donations), Event Espresso (events), and WooCommerce (e-commerce) to work together. Unlike Wix, WordPress’s &lt;em&gt;open-source backend&lt;/em&gt; can be tuned and extended for complex data processing, which supports &lt;strong&gt;faster load times&lt;/strong&gt; and &lt;em&gt;scalable infrastructure&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Edge Case:&lt;/em&gt; If the nonprofit expects rapid growth (e.g., 10x traffic in 2 years), a WordPress site on &lt;strong&gt;cloud-based hosting&lt;/strong&gt; can scale horizontally, while Wix’s fixed infrastructure may not keep up.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Custom Development (Laravel/Django):&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Cost:&lt;/em&gt; $10,000–$25,000&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Mechanism:&lt;/em&gt; With frameworks like Laravel or Django, the site is &lt;strong&gt;built from the ground up&lt;/strong&gt; around the nonprofit’s complex features. A &lt;em&gt;robust backend architecture&lt;/em&gt; prevents data bottlenecks, and a &lt;strong&gt;modular design&lt;/strong&gt; allows for future expansions without breaking existing systems.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Edge Case:&lt;/em&gt; If the nonprofit needs &lt;em&gt;unique donor tracking algorithms&lt;/em&gt; or &lt;em&gt;AI-driven event recommendations&lt;/em&gt;, custom development is the only option. Wix cannot handle such complexity without failing.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Specialized Nonprofit Platforms (e.g., NeonCRM, Kindful):&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Cost:&lt;/em&gt; $5,000–$12,000&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Mechanism:&lt;/em&gt; These platforms are &lt;strong&gt;pre-built for nonprofits&lt;/strong&gt;, with features like donor management, event tracking, and volunteer coordination already integrated. Their &lt;em&gt;optimized workflows&lt;/em&gt; reduce development time and costs compared to custom solutions.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Edge Case:&lt;/em&gt; If the nonprofit relies heavily on &lt;em&gt;automated donor communications&lt;/em&gt;, specialized platforms offer &lt;strong&gt;pre-configured email sequences&lt;/strong&gt;, while Wix would require costly custom coding.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
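
&lt;p&gt;As a quick sanity check, the upfront ranges quoted in the list above can be set against the $15,000 Wix quote in a few lines of Python. The sketch covers upfront cost only; hosting, maintenance, and licensing are not quantified in this article and are left out.&lt;/p&gt;

```python
WIX_QUOTE = 15_000

# Upfront cost ranges in USD, as quoted in the list above.
ALTERNATIVES = {
    "WordPress with plugins": (3_000, 8_000),
    "Custom development (Laravel/Django)": (10_000, 25_000),
    "Specialized nonprofit platform": (5_000, 12_000),
}


def savings_vs_quote(cost_range, quote=WIX_QUOTE):
    """Return (worst-case, best-case) savings; negative means it costs more."""
    low, high = cost_range
    return quote - high, quote - low


for name, cost_range in ALTERNATIVES.items():
    worst, best = savings_vs_quote(cost_range)
    print(f"{name}: saves between ${worst:,} and ${best:,}")
```

&lt;p&gt;A negative worst case, as with custom development, is a reminder that the upper end of that range costs more than the Wix quote, so the argument for it rests on longevity, not on price.&lt;/p&gt;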

&lt;h3&gt;
  
  
  Decision Dominance: The Optimal Solution
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Rule:&lt;/strong&gt; &lt;em&gt;If a nonprofit requires complex, scalable features, avoid Wix. Invest in WordPress with plugins for cost-effectiveness, or custom development for unique needs.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Here’s why:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;WordPress Wins for Most Nonprofits:&lt;/strong&gt; It balances cost ($3K–$8K) and functionality, with plugins that &lt;strong&gt;scale as the organization grows&lt;/strong&gt;. Its open-source nature prevents vendor lock-in, unlike Wix.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Custom Development for Edge Cases:&lt;/strong&gt; If the nonprofit has &lt;em&gt;unique requirements&lt;/em&gt; (e.g., AI integrations), custom frameworks are optimal—despite higher upfront costs, they &lt;strong&gt;save money long-term&lt;/strong&gt; by avoiding technical debt.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Avoid Wix at All Costs:&lt;/strong&gt; Its limitations create a &lt;em&gt;risk cascade&lt;/em&gt;: technical debt, scalability failure, and vendor lock-in. At $15K, it’s a &lt;strong&gt;waste of resources&lt;/strong&gt; that will require a rebuild within 2–3 years.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Convincing Management: Practical Insights
&lt;/h3&gt;

&lt;p&gt;To steer management away from Wix, focus on &lt;strong&gt;tangible risks and long-term savings&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Highlight Wix’s Limitations:&lt;/strong&gt; Explain how its &lt;em&gt;drag-and-drop simplicity&lt;/em&gt; becomes a &lt;strong&gt;liability under pressure&lt;/strong&gt;, using examples like server crashes during fundraising campaigns.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Quantify Cost Savings:&lt;/strong&gt; Show how WordPress or specialized platforms deliver the same features for &lt;strong&gt;roughly half the price&lt;/strong&gt; ($7K vs. $15K) without compromising scalability.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bring in Expert Validation:&lt;/strong&gt; Consult a web developer to audit the Wix quote and expose its &lt;em&gt;over-customization risks&lt;/em&gt;. Use their assessment to build credibility with management.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By framing the decision as a &lt;strong&gt;choice between short-term desperation and long-term sustainability&lt;/strong&gt;, you can guide management toward a solution that aligns with the nonprofit’s mission—without wasting $15,000 on a platform destined to fail.&lt;/p&gt;

&lt;h2&gt;
  
  
  Feature Feasibility: Can Wix Handle the Complexity?
&lt;/h2&gt;

&lt;p&gt;The nonprofit’s $15,000 Wix proposal raises a critical question: &lt;strong&gt;Can Wix’s drag-and-drop simplicity support complex features like event management, volunteer tracking, and e-commerce without collapsing under pressure?&lt;/strong&gt; The answer lies in Wix’s technical architecture and its practical limitations when pushed beyond small-scale use cases.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wix’s Breaking Points: A Mechanical Breakdown
&lt;/h2&gt;

&lt;p&gt;Wix’s backend is a &lt;em&gt;proprietary, closed-source system&lt;/em&gt; optimized for static, low-traffic sites. When forced to handle dynamic, data-heavy features like event registrations or donor tracking, the following failures occur:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Database Overload:&lt;/strong&gt; Wix’s database structure is not designed for heavy write operations (e.g., simultaneous event sign-ups). This causes &lt;em&gt;query bottlenecks&lt;/em&gt;, where the database server’s CPU spikes, leading to &lt;strong&gt;5-10x slower load times&lt;/strong&gt; during peak usage.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Brittle Custom Code:&lt;/strong&gt; Adding complex features requires &lt;em&gt;Wix Velo custom code&lt;/em&gt;, which hooks into Wix’s proprietary framework. These hooks &lt;em&gt;break during platform updates&lt;/em&gt;, as Wix’s internal APIs change without backward compatibility. Result: &lt;strong&gt;Technical debt accumulates&lt;/strong&gt;, requiring constant rewrites.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalability Collapse:&lt;/strong&gt; Wix manages hosting itself, so site owners &lt;em&gt;cannot add servers or tune how traffic is distributed&lt;/em&gt;. During high-traffic events (e.g., fundraising campaigns), a heavily customized site may &lt;em&gt;hit platform resource limits&lt;/em&gt;, which can surface as &lt;strong&gt;503 errors or site crashes&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Edge-Case Analysis: Where Wix Fails
&lt;/h2&gt;

&lt;p&gt;Consider a &lt;em&gt;24-hour fundraising event&lt;/em&gt; with 5,000 simultaneous users. Wix’s infrastructure would:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Hit &lt;strong&gt;database read/write limits&lt;/strong&gt;, causing donation processing delays (impact: lost revenue).&lt;/li&gt;
&lt;li&gt;Come under &lt;em&gt;sustained CPU load&lt;/em&gt;, at which point the platform may throttle requests (observable effect: users see “Site Unavailable” messages).&lt;/li&gt;
&lt;li&gt;Corrupt session data due to &lt;em&gt;memory leaks in custom Velo code&lt;/em&gt;, requiring a full site restart (risk mechanism: unsanitized user inputs in event registration forms).&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Alternatives: Mechanisms and Dominance
&lt;/h2&gt;

&lt;p&gt;Three alternatives outperform Wix by addressing its core failures:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Solution&lt;/th&gt;
&lt;th&gt;Mechanism&lt;/th&gt;
&lt;th&gt;Dominance Condition&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;WordPress + Plugins&lt;/td&gt;
&lt;td&gt;Open-source backend with &lt;em&gt;horizontal scaling&lt;/em&gt; via cloud hosting (e.g., AWS). Plugins like GiveWP use &lt;em&gt;optimized SQL queries&lt;/em&gt; to prevent database bottlenecks.&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Optimal for 80% of nonprofits.&lt;/strong&gt; Falls short only if you need &lt;em&gt;custom AI/ML features&lt;/em&gt; (e.g., predictive donor analytics).&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Custom Development (Laravel/Django)&lt;/td&gt;
&lt;td&gt;Modular microservices architecture. Each feature (e.g., event management) runs on a &lt;em&gt;separate containerized service&lt;/em&gt;, preventing single points of failure.&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Optimal for unique needs.&lt;/strong&gt; Overkill if features are standard (e.g., basic e-commerce).&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Specialized Platforms (NeonCRM)&lt;/td&gt;
&lt;td&gt;Pre-built nonprofit workflows. Uses &lt;em&gt;pre-optimized database schemas&lt;/em&gt; for donor/event data, reducing development time by 70%.&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Optimal for time-sensitive launches.&lt;/strong&gt; Limited customization compared to WordPress/custom builds.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Convincing Management: Practical Insights
&lt;/h2&gt;

&lt;p&gt;To counter Wix’s appeal, use these evidence-backed arguments:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Quantify Risk:&lt;/strong&gt; “Custom code on Wix’s proprietary backend can &lt;em&gt;break during updates&lt;/em&gt;, requiring $5,000/year in emergency fixes. WordPress plugins are open source and can be tested on a staging site before each update.”&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Expose Hidden Costs:&lt;/strong&gt; “The $15,000 Wix quote includes &lt;em&gt;brittle custom code&lt;/em&gt; that’ll cost $10,000 to replace in 3 years. WordPress delivers the same features for $6,000 upfront.”&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Leverage Expert Validation:&lt;/strong&gt; “Web developers avoid Wix for complex sites due to &lt;em&gt;server crash risks&lt;/em&gt;. Here’s a case study where a similar nonprofit rebuilt their Wix site after 18 months.”&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Decision Rule: If X, Use Y
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;If your nonprofit requires complex, scalable features (e.g., event management + e-commerce), avoid Wix.&lt;/strong&gt; Its simplicity creates technical debt and scalability failures. Instead:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use &lt;strong&gt;WordPress with plugins&lt;/strong&gt; if features are standard and budget is under $10,000.&lt;/li&gt;
&lt;li&gt;Choose &lt;strong&gt;custom development&lt;/strong&gt; if unique features are required (e.g., AI-driven donor insights).&lt;/li&gt;
&lt;li&gt;Opt for &lt;strong&gt;specialized platforms&lt;/strong&gt; if launching within 3 months is critical.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;The $15,000 Wix proposal is a textbook example of vendor exploitation. By understanding the platform’s technical failure modes, you can steer management toward solutions that won’t crumble under real-world usage.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Recommendations and Next Steps
&lt;/h2&gt;

&lt;p&gt;Your nonprofit is at a critical juncture: invest wisely in a website that scales with your mission or risk pouring $15,000 into a Wix solution that will buckle under pressure. Here’s a step-by-step plan to avoid technical debt, vendor lock-in, and long-term inefficiencies.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Audit Your Needs: Separate Core from Optional Features
&lt;/h3&gt;

&lt;p&gt;Wix vendors often bundle unnecessary features to inflate costs. &lt;strong&gt;Distinguish must-haves from nice-to-haves&lt;/strong&gt;. For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Core Features:&lt;/strong&gt; Event management, donor tracking, basic e-commerce.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Optional Features:&lt;/strong&gt; AI-driven recommendations, custom donor dashboards.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Mechanism:&lt;/em&gt; Overloading Wix with optional features forces developers to write brittle custom code, which &lt;strong&gt;complicates the backend&lt;/strong&gt; and leads to data sync failures and slow load times. By stripping down to essentials, you reduce technical debt and lower costs.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Explore Cost-Effective Alternatives
&lt;/h3&gt;

&lt;p&gt;Wix’s $15,000 quote is a red flag. Here’s how alternatives stack up:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;WordPress + Plugins ($3K–$8K):&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Mechanism:&lt;/em&gt; Open-source backend with plugins like GiveWP and WooCommerce that &lt;strong&gt;scales horizontally&lt;/strong&gt; on cloud hosting, reducing the risk of server crashes during high-traffic events.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Edge Case:&lt;/em&gt; Handles 5,000+ simultaneous users without CPU/memory overload, unlike Wix’s vertically scaled infrastructure.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Specialized Nonprofit Platforms (NeonCRM, $5K–$12K):&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Mechanism:&lt;/em&gt; Pre-optimized database schemas for donor management &lt;strong&gt;reduce query bottlenecks&lt;/strong&gt;, ensuring faster processing during campaigns.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Edge Case:&lt;/em&gt; Pre-built workflows, such as automated email sequences, cut development time by 70%, ideal for time-sensitive launches.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Custom Development (Laravel/Django, $10K–$25K):&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Mechanism:&lt;/em&gt; Modular microservices architecture &lt;strong&gt;eliminates single points of failure&lt;/strong&gt;, critical for unique features like AI-driven insights.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Edge Case:&lt;/em&gt; Overkill for standard features; only use if WordPress plugins cannot meet specific needs.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h3&gt;
  
  
  3. Quantify Risks and Hidden Costs
&lt;/h3&gt;

&lt;p&gt;Present management with hard numbers to counter Wix’s appeal:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Technical Debt:&lt;/strong&gt; Wix’s brittle custom code requires &lt;strong&gt;$5,000/year in emergency fixes&lt;/strong&gt; due to API changes breaking the backend.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalability Failure:&lt;/strong&gt; Wix crashes under 5,000+ users, causing &lt;strong&gt;503 errors&lt;/strong&gt; and lost donations during peak campaigns.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vendor Lock-In:&lt;/strong&gt; Migrating from Wix after heavy customizations costs &lt;strong&gt;$10,000+ to rebuild&lt;/strong&gt;, as proprietary code is non-transferable.&lt;/li&gt;
&lt;/ul&gt;
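
&lt;p&gt;To turn these figures into a single number for management, a minimal sketch of a three-year total-cost comparison might look like the following. The Wix figures ($15,000 quote, $5,000/year in fixes, $10,000 rebuild) and the $6,000 WordPress upfront cost come from this article; the &lt;code&gt;Option&lt;/code&gt; class and &lt;code&gt;total_cost&lt;/code&gt; function are hypothetical names, and the zero values for WordPress maintenance and rebuild are placeholder assumptions, not real quotes.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class Option:
    name: str
    upfront: int       # one-time build cost in USD
    yearly_fixes: int  # recurring maintenance or emergency fixes per year
    rebuild: int       # one-time rebuild or migration cost within the window


def total_cost(option, years):
    """Return the total cost of ownership over the given number of years."""
    return option.upfront + option.yearly_fixes * years + option.rebuild


# Figures quoted in this article for the Wix proposal.
wix = Option("Wix", upfront=15_000, yearly_fixes=5_000, rebuild=10_000)

# The article only gives the WordPress upfront figure. The zeros below are
# placeholders (assumptions) to be replaced with real maintenance quotes.
wordpress = Option("WordPress", upfront=6_000, yearly_fixes=0, rebuild=0)

for option in (wix, wordpress):
    print(f"{option.name}: ${total_cost(option, years=3):,} over 3 years")
```

&lt;p&gt;With the article’s numbers, the Wix path adds up to $40,000 over three years. Replace the WordPress placeholders with actual hosting and maintenance quotes before presenting the comparison.&lt;/p&gt;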

&lt;h3&gt;
  
  
  4. Leverage Expert Validation
&lt;/h3&gt;

&lt;p&gt;Developers avoid Wix for complex sites due to its &lt;strong&gt;proprietary backend limitations&lt;/strong&gt;. Share case studies of nonprofits forced to rebuild Wix sites within 18 months due to scalability failures. Highlight how WordPress or specialized platforms deliver the same features for &lt;strong&gt;half the cost&lt;/strong&gt; without technical debt.&lt;/p&gt;

&lt;h3&gt;
  
  
  Decision Rule: If X, Use Y
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;If your nonprofit needs standard features and has a budget under $10,000:&lt;/strong&gt; Use &lt;strong&gt;WordPress + plugins&lt;/strong&gt; for scalability and cost-effectiveness.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;If you require unique features (e.g., AI-driven insights):&lt;/strong&gt; Invest in &lt;strong&gt;custom development&lt;/strong&gt; to avoid long-term inefficiencies.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;If time is critical (3-month launch):&lt;/strong&gt; Opt for &lt;strong&gt;specialized platforms&lt;/strong&gt; like NeonCRM to minimize development time.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;If Wix is proposed:&lt;/strong&gt; &lt;strong&gt;Reject it&lt;/strong&gt; for complex, scalable features due to technical debt and vendor lock-in risks.&lt;/li&gt;
&lt;/ul&gt;
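
&lt;p&gt;The rule above can be sketched as a small function, which makes it easier to apply the same criteria to every proposal. This is only an illustration: the function name &lt;code&gt;recommend&lt;/code&gt; and its parameters are hypothetical, and the order of the checks is an assumption, since the rule does not say which condition wins when several apply. Wix is never returned, in line with the last rule.&lt;/p&gt;

```python
def recommend(needs_unique_features, launch_within_3_months,
              standard_features, budget_under_10k):
    """Return a platform recommendation following the decision rule above.

    The order of the checks is an assumption; the article does not say
    which rule wins when several apply.
    """
    if needs_unique_features:
        return "custom development"
    if launch_within_3_months:
        return "specialized platform (e.g., NeonCRM)"
    if standard_features and budget_under_10k:
        return "WordPress + plugins"
    return "no rule matches: gather more quotes"


print(recommend(needs_unique_features=False, launch_within_3_months=False,
                standard_features=True, budget_under_10k=True))
```

&lt;p&gt;Answering the four questions as yes/no keeps the discussion with management focused on requirements rather than on a specific vendor.&lt;/p&gt;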

&lt;h3&gt;
  
  
  Practical Next Steps
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Request Detailed Quotes:&lt;/strong&gt; Ask Wix vendors to break down costs. Challenge over-customizations that push Wix beyond its capabilities.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Consult an Independent Developer:&lt;/strong&gt; Have a third-party expert audit the Wix proposal to expose hidden risks and overpricing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pilot a WordPress Solution:&lt;/strong&gt; Start with a $5,000 WordPress site to test functionality. Scale up with plugins as needed.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;By following these steps, your nonprofit can avoid the Wix trap and build a website that grows with your mission, not against it.&lt;/p&gt;

</description>
      <category>nonprofit</category>
      <category>wix</category>
      <category>scalability</category>
      <category>alternatives</category>
    </item>
  </channel>
</rss>
